Blogs

Dataflow tracker

I have just added a module to my generic tracer that I call the "dataflow tracker".

It is meant to answer the question "where is each byte received from the network RIGHT NOW?"

It is still far from release quality, so I can't publish it yet.

But how it works is extremely simple. When a function like the socket recv() is called and receives a chunk of data from the network, the dataflow tracker (dt) marks each byte in the memory buffer in the form:

Strings in Oracle RDBMS network layer

Not sure if it's worth blogging...

All strings in the Oracle RDBMS network layer are ordinary C strings terminated by a zero byte, but the string length is often also passed as a separate function argument.
This makes some things much faster.
* strlen() is no longer necessary: just use the string length you already have.
* strcat() does not need to calculate string lengths.
* strcmp() against a constant string works much faster:
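
A minimal sketch of the last two points, assuming the caller already holds the length of each string. This is not Oracle's code: STR_EQ_CONST and cat_known are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare a string of known length against a string
   literal. sizeof(lit) - 1 is the literal's length, computed at compile
   time, so a length mismatch is rejected without reading a single byte. */
#define STR_EQ_CONST(s, s_len, lit) \
    ((s_len) == sizeof(lit) - 1 && memcmp((s), (lit), sizeof(lit) - 1) == 0)

/* Hypothetical helper: append src (src_len bytes) to dst, which currently
   holds dst_len bytes in a buffer of dst_cap bytes. No scan for the
   terminating zero is needed. Returns the new length. */
size_t cat_known(char *dst, size_t dst_len, size_t dst_cap,
                 const char *src, size_t src_len)
{
    assert(dst_len + src_len < dst_cap);   /* room for data and the zero byte */
    memcpy(dst + dst_len, src, src_len);
    dst[dst_len + src_len] = '\0';
    return dst_len + src_len;
}
```

The strings stay zero-terminated, so they can still be handed to ordinary C library functions when needed.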

Instead of:

Oracle/SAP exploits/PoCs

By the way, is anyone from an information security company reading this blog who would like to pay for Oracle and/or SAP exploits/PoCs? I have a couple...

ops_SIMD 0.3

Here is the third version of my Oracle RDBMS hash cracker (DES-based hashes), the fastest known so far:
http://conus.info/utils/ops_SIMD/

Generic tracer 0.5 beta

Generic Tracer 0.5 beta is published for testing.

Besides fixes and one small feature (see the changelog.txt file), the major feature I added is TRACE.

TRACE: trace each instruction in a function and collect all interesting values from registers and memory. After execution, all that information is saved to the process.exe.idc, process.exe.txt and process.exe_clear.idc files. The .idc files are IDA scripts; the .txt file can be processed with grep, awk and sed.

For example, let's take the add_member function from the article "Using Uninitialized Memory for Fun and Profit":

int dense[256];
int dense_next=0;
int sparse[256];

void add_member(int i)
{
	dense[dense_next]=i;
	sparse[i]=dense_next;
	dense_next++;
}

int main ()
{
	add_member(123);
	add_member(5);
	add_member(71);
	add_member(99);
}

Let's compile it and trace the add_member function (determine the function's address in IDA beforehand):

gt -l:trace_test4.exe bpf=0x00401000,trace

We'll get the trace_test4.exe.txt file:

0x401000, e=       4
0x401001, e=       4
0x401003, e=       4, [0x403818]=0..3
0x401008, e=       4, [EBP+8]=5, 0x47('G'), 0x63('c'), 0x7b('{')
0x40100b, e=       4, ECX=5, 0x47('G'), 0x63('c'), 0x7b('{')
0x401012, e=       4, [EBP+8]=5, 0x47('G'), 0x63('c'), 0x7b('{')
0x401015, e=       4, [0x403818]=0..3
0x40101a, e=       4, EAX=0..3
0x401021, e=       4, [0x403818]=0..3
0x401027, e=       4, ECX=0..3
0x40102a, e=       4, ECX=1..4
0x401030, e=       4
0x401031, e=       4, EAX=0..3

The e field shows how many times the instruction was executed.

Let's execute the trace_test4.exe.idc script in IDA and we'll see:

Now it is much simpler to understand how this function works during execution.

Executed instructions are highlighted in blue. Instructions that were not executed are left white.

If you need to clear all comments and highlighting, execute the trace_test4.exe_clear.idc script.

In the IDA script, the collected information may be reduced to a shortened form like EAX=[ 64 unique items. min=0xbca6eb7, max=0xffffffed ] (because IDA limits comment size). By contrast, everything is saved to the text file without shortening, which is why the resulting text file may sometimes be pretty big.

One problem with the TRACE feature is that it is slow, although functions from system DLLs are skipped (a system DLL is one residing in %SystemRoot%). Another problem is that exceptions, setjmp/longjmp and other unexpected alterations of code flow are not handled correctly so far.

One more problem is that this feature is only available on x86 (because the gt project currently has only an x86 disassembler).

More examples: http://conus.info/gt/gt05beta/manual/gt.html#bpf_ex_trace

Download gt executables, source code and manuals: http://conus.info/gt/gt05beta/gt05beta.rar

Making C compiler generate obfuscated code

A customer of mine asked whether it is possible to protect his software from reverse engineering. I didn't find any C/C++ compiler able to produce obfuscated code that is hard to reverse engineer and complicates the use of tools such as the Hex-Rays Decompiler, so I made a little attempt to hack the Tiny C compiler's code generator.

I patched it so that it produces a lot of random noise code between the effective code. Of course, the resulting code will run much slower. But in real life, we can obfuscate only the critical parts of the code, those containing algorithms we don't want to leak easily. Of course, it is virtually impossible to protect any code from reverse engineering completely, but it is possible to make it much more difficult.
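
The idea of padding effective code with junk can be sketched as follows. This is not the actual Tiny C patch: noise_insn, its dead_mask parameter and the choice of mov/lea as junk are assumptions made for illustration. The sketch formats textual assembly and only ever writes to registers the caller declares dead, so the surrounding effective code keeps its values; mov and lea also leave the flags untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static const char *const REG_NAMES[4] = { "eax", "ebx", "ecx", "edx" };

/* 32 random bits; rand() alone may deliver as few as 15. */
static unsigned rand32(void)
{
    return ((unsigned)rand() << 16) ^ (unsigned)rand();
}

/* Formats one junk instruction into line[0..cap).
   dead_mask: bit r is set if REG_NAMES[r] holds no live value here
   (an assumption: the caller knows which registers are dead).
   Returns the index of the clobbered register, or -1 if none is dead. */
int noise_insn(char *line, size_t cap, unsigned dead_mask)
{
    int dead[4], n_dead = 0;

    assert(line != NULL && cap > 0);
    for (int r = 0; r < 4; r++)
        if (dead_mask & (1u << r))
            dead[n_dead++] = r;
    if (n_dead == 0) {
        line[0] = '\0';              /* nothing is safe to overwrite */
        return -1;
    }

    int dst = dead[rand() % n_dead];
    if (rand() % 2)                  /* load a random constant */
        snprintf(line, cap, "\tmov %s, 0x%08x", REG_NAMES[dst], rand32());
    else                             /* address arithmetic, no memory access */
        snprintf(line, cap, "\tlea %s, [%s+0x%x]", REG_NAMES[dst],
                 REG_NAMES[rand() % 4], rand32() % 128);
    return dst;
}
```

A code generator would call noise_insn a random number of times between two effective instructions, passing the set of registers that are dead at that point.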

Example: simple function:

int a (int a, int b)
{
	return a + b * 4;
}

On output...

a               proc near

var_CD500B      = byte ptr -0CD500Bh
arg_0           = dword ptr  8
arg_4           = dword ptr  0Ch
arg_1D364BDE    = byte ptr  1D364BE6h

                nop
                push    ebp
                mov     ebp, esp
                sub     esp, 0
                nop
                xor     eax, ebx
                mov     eax, 99B7A34Ah
                mov     eax, 0EC06E7ACh
                lea     edx, [esi+63h]
                mov     ebx, [ebp+arg_0]
                and     ebx, ebx

loc_800001F:
                lea     ebx, [ebp+arg_1D364BDE]
                mov     ebx, 9EF81F3Eh
                lea     eax, [ebx+3Eh]
                lea     ecx, [esi]
                mov     eax, 0FD6D5D47h
                sub     ebx, edx
                lea     ecx, [ebp+var_CD500B]
                lea     ecx, [eax]
                mov     eax, [ebp+arg_4] ; *
                shl     eax, 2          ; *
                mov     ecx, eax
                adc     ecx, edx
                mov     ecx, [ebp+arg_0]
                adc     ecx, ecx
                sub     edx, ecx
                sub     edx, eax
                lea     ebx, [esp+ecx*8]
                mov     ecx, 29262C66h
                mov     ebx, 0CC18D2C4h
                mov     ebx, 0FDB56490h
                mov     ecx, 9E709D5Eh
                mov     ecx, 73805EBFh
                mov     ecx, eax
                or      ecx, eax
                mov     ebx, 7339AD0Eh
                mov     edx, 2CA8725Ah
                lea     edx, [edi+esi*8]
                mov     ebx, 87684A89h
                mov     ebx, 52A74759h
                xor     edx, edx
                jnz     short loc_800001F
                mov     ebx, 0CCA90613h
                sub     ecx, eax
                mov     ecx, 0C6699FDh
                mov     ebx, 0A8B272A1h
                mov     ebx, eax
                sbb     ebx, ebx
                mov     ecx, [ebp+arg_0] ; *
                add     ecx, eax        ; *
                or      edx, ebx
                mov     edx, 47257B14h
                mov     edx, ecx
                add     edx, edx
                mov     eax, 9E3E878Ah
                mov     ebx, 0DAB5E429h
                mov     edx, 0ABFDB94Eh
                adc     eax, ebx
                add     edx, ebx
                lea     edx, [ebx+75A1EF29h]
                or      edx, edx
                mov     eax, ecx        ; *
                jmp     $+5
                leave
                pop     ebx
                jmp     ebx
a               endp

(effective code is marked with an asterisk)

One funny thing is that the compiler now uses a random number generator. "Almost all good computer programs contain at least one random-number generator." (fortune file in the Plan 9 OS).

Here is also a crackme I created for testing. It was eventually reversed, though.

For those who are interested:

Patch for Tiny C version 0.9.25

Full source code patched

Tiny C 0.9.25 patched win32 executables

Update: comp.compilers thread

Update2: I would do something like that for the GCC compiler if someone sponsored the job. -> dennis@conus.info

Oracle .msb files unpacker

.msb files contain various messages in compiled form.
For those who might be interested in unpacking them, here is my utility, with source code, that extracts error numbers and messages from these files.

Download win32 binary + source code.

Need reverse engineering like this? -> dennis@conus.info

Adding old dongle support to DosBox

An emulation of an old copy-protection dongle for DOS software can be implemented right in the DosBox DOS emulator.

Here are my patches for DosBox 0.74, enabling it to support a 93c46-based dongle:
http://conus.info/stuff/dosbox/dongle.cpp
http://conus.info/stuff/dosbox/dosbox.cpp.patch

At least the old Rainbow Sentinel Cplus and MicroPhar dongles are 93c46-based.

The 93c46 memory chip contains 64 16-bit words. More on it here.

The source code is self-explanatory. What you need to do is add dongle.cpp to the project, patch dosbox.cpp, and fill the MEMORY array representing the dongle memory. You may also need to change the wiring scheme between the 93c46 and the printer port. The wiring may differ from dongle to dongle, but usually the DI (data input), SK (clock), CS (chip select) and power lines are taken from D0..D7 in some order. DO (data output) may be connected to the ACK or BUSY printer line.
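
The serial protocol such an emulation has to speak can be sketched as follows. This is not the code from dongle.cpp, only a minimal illustration that assumes the 64 x 16-bit organization and handles nothing but the READ command (start bit, opcode 10, 6 address bits, then a dummy zero bit and 16 data bits, most significant first). The type ee93c46 and the functions ee_update, ee_clock and ee_read_word are hypothetical names; mapping printer-port bits to CS, SK, DI and DO is left out because it depends on the wiring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state of an emulated 93c46 (64 x 16-bit organization). */
typedef struct {
    uint16_t mem[64];     /* dongle contents */
    int prev_sk;          /* previous clock level, to detect rising edges */
    int started;          /* start bit seen */
    int nbits;            /* opcode + address bits received so far */
    unsigned cmd;         /* 2-bit opcode followed by 6-bit address */
    uint16_t shift_out;   /* word being shifted out, MSB first */
    int out_left;         /* data bits still to send */
    int dout;             /* current level of the DO line */
} ee93c46;

/* Feed the current levels of CS, SK and DI; returns the level of DO. */
int ee_update(ee93c46 *e, int cs, int sk, int di)
{
    if (!cs) {                              /* deselected: abort any command */
        e->started = e->nbits = e->out_left = 0;
        e->cmd = 0;
    } else if (sk && !e->prev_sk) {         /* rising clock edge */
        if (e->out_left > 0) {              /* READ in progress */
            e->dout = (e->shift_out >> 15) & 1;
            e->shift_out = (uint16_t)(e->shift_out << 1);
            e->out_left--;
        } else if (!e->started) {
            e->started = (di != 0);         /* wait for the start bit */
        } else {
            e->cmd = (e->cmd << 1) | (di != 0);
            if (++e->nbits == 8) {
                if ((e->cmd >> 6) == 2) {   /* opcode 10 = READ */
                    e->shift_out = e->mem[e->cmd & 0x3f];
                    e->out_left = 16;
                    e->dout = 0;            /* dummy zero bit before the data */
                }
                e->started = e->nbits = 0;  /* other opcodes are ignored */
                e->cmd = 0;
            }
        }
    }
    e->prev_sk = sk;
    return e->dout;
}

/* One full clock pulse with CS held high; returns DO after the rising edge. */
static int ee_clock(ee93c46 *e, int di)
{
    ee_update(e, 1, 0, di);
    return ee_update(e, 1, 1, di);
}

/* Host side: the bit sequence a program sends to read one word. */
uint16_t ee_read_word(ee93c46 *e, unsigned addr)
{
    unsigned word = 0;
    int dummy = 0;

    assert(addr < 64);
    ee_update(e, 0, 0, 0);                  /* start from a deselected chip */
    ee_clock(e, 1);                         /* start bit */
    ee_clock(e, 1);                         /* opcode 10 = READ */
    ee_clock(e, 0);
    for (int i = 5; i >= 0; i--)            /* address, MSB first */
        dummy = ee_clock(e, (addr >> i) & 1);
    assert(dummy == 0);                     /* dummy bit follows the address */
    for (int i = 0; i < 16; i++)            /* data, MSB first */
        word = (word << 1) | (unsigned)ee_clock(e, 0);
    ee_update(e, 0, 0, 0);                  /* deselect */
    return (uint16_t)word;
}
```

In an emulator, a port-write handler would extract CS, SK and DI from the written byte according to the wiring and call ee_update; the returned level would then be reported in the ACK or BUSY bit of the status port.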

Now, how do you read a 93c46-based dongle? Get a free reader here (sread.zip):
http://safe-key.com/freesoftware.html
It produces an encrypted file; however, sread.exe can be patched (write the byte 0xC3 at address 0xE1B), and then an unencrypted dump file will be created.

But what if you do not have a dongle to read the information from? First, take a look at the log messages: which cells does your software read? Try putting a value like 0x6669 into those cells, for example. Compile DosBox with the heavy debug option and log all executed instructions together with the register state:
http://blogs.conus.info/node/55

... then just grep LOGCPU.TXT for 6669: the value from the dongle is probably compared with some other constant. This is not a rule, though; things may be much more complex.

Using debugging features of DosBox

DosBox is a DOS emulator; one could say it is a kind of virtual machine, mainly used for retrocomputing and retrogaming.
One interesting feature of DosBox compiled with the "heavydebug" option is its built-in disassembler. It is not very powerful, but it can log every instruction it executes together with the full register state.

Load your old DOS software by typing "DEBUG program.exe" on the command line, and the debugger will be activated. Type "LOG 100000" to run the program and log 100,000 executed instructions. All of them are dumped into the LOGCPU.TXT file:

2AF4:0000005A  mov  dx,ds EAX:00000000 EBX:00002AF4 ECX:00000004 EDX:00001AB0 ESI:0000FFF2
EDI:0000FFD2 EBP:00000000 ESP:00000080 DS:0FBF ES:1AB7 FS:0000 GS:0000
SS:2C6E CF:1 ZF:1 SF:0 OF:0 AF:4 PF:4 IF:1

2AF4:0000005C  sub  dx,ax EAX:00000000 EBX:00002AF4 ECX:00000004 EDX:00000FBF ESI:0000FFF2
EDI:0000FFD2 EBP:00000000 ESP:00000080 DS:0FBF ES:1AB7 FS:0000 GS:0000
SS:2C6E CF:1 ZF:1 SF:0 OF:0 AF:4 PF:4 IF:1

2AF4:00000064  shl  ax,cl EAX:00000000 EBX:00002AF4 ECX:00000004 EDX:00000FBF ESI:0000FFF2
EDI:0000FFD2 EBP:00000000 ESP:00000080 DS:0FBF ES:1AB7 FS:0000 GS:0000
SS:2C6E CF:0 ZF:0 SF:0 OF:0 AF:0 PF:0 IF:1

What we get is a relatively big text file (it can be as big as a couple of gigabytes), which can easily be parsed with grep, sed, AWK or whatever you like.
Let's get back to a real-world task. I have a very old DOS program that requires access to a very old piece of hardware, a copy-protection dongle, and we need to get rid of it. Back in the DOS days, these dongles were connected to the LPT printer port. So what we know is that our DOS program accesses it at least via port 0x378.
Let's run our program in DosBox without the dongle and pick out all "out dx,al" instructions (writes to a port) where the EDX register holds the port number 0x378.

grep "out  dx,al" LOGCPU.TXT | grep "EDX:00000378"

We got something like:

1311:000002CA  out  dx,al EAX:00001212 EBX:00000000 ECX:00000001 EDX:00000378 ESI:00000378 
EDI:00007C2E EBP:00007BEC ESP:00007B7E DS:1311 ES:0000 FS:0000 GS:0000
SS:26E7 CF:0 ZF:0 SF:0 OF:0 AF:0 PF:4 IF:1

Wow, we see the places where the program tries to access the dongle.
Now let's go deeper.
When our program can't find the dongle, it exits to DOS. We suppose it calls interrupt INT 21h with function code AH=4Ch, meaning "exit to DOS". But it is not common to call it right after IN/OUT instructions in the same function. At the top level there might be functions like "check for dongle presence" and "read memory cell from dongle". At the lowest level of the dongle library there can be "write to dongle", "read from dongle", and "perform a delay so that the slow dongle can respond".
We must find a function like "check for OUR dongle presence: read feature flags from it and get evidence that it is ours, that it has not expired, etc". Most often, when such a function fails, we see an error message like "dongle not connected" or "invalid dongle", and the program terminates.
I wrote a very small utility for parsing the LOGCPU.TXT file. It tracks not only the places where IN/OUT instructions are executed, but also the call stack at that moment. It also tracks all INT 21h interrupts.
Here is what I got:

OUT  |EDX:00000378|9BD2:2FF 9D09:88 1A7:34 1A7:53 1A7:251 1A7:26F
1A7:42E 1A7:5A3 1A7:1B81 1A7:1B8C 1A7:1B31 1A7:1B3C 1A7:7E8 1A7:80B
786:147E 786:1992 9CA:375 9CA:118 9CA:1FE
...
IN   |EDX:00000379|9BD2:2FF 9D09:88 1A7:34 1A7:53 1A7:251 1A7:26F
1A7:42E 1A7:5A3 1A7:1B81 1A7:1B8C 1A7:1B31 1A7:1B3C 1A7:7E8 1A7:80B
786:147E 786:1992 9CA:375 9CA:118
...
INT21|EAX:00004C00|9BD2:2FF 9D09:88 1A7:34 1A7:53 1A7:251 1A7:26F
1A7:42E 1A7:5A3 1A7:1B81 1A7:1B8C 1A7:1B31 1A7:1B3C 1A7:7E8 1A7:80B
786:147E 786:1A1C 8ED4:160

Each chain dumped is the call stack at the moment when IN/OUT/INT21 was executed. Now all we need is to find the longest common part of the chains and, from it, the crucial moment where the program decided to exit because the dongle was not connected.

That moment is probably somewhere near the place where the chain paths diverge.
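
Finding the point of divergence can be sketched like this. It is not my parsing utility, just a minimal illustration that assumes each chain is a string of whitespace-separated seg:offset frames, outermost first, as in the dump above; common_frames and next_frame are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skips blanks at *p and returns the length of the frame that follows
   (0 at the end of the chain). */
static size_t next_frame(const char **p)
{
    *p += strspn(*p, " \t\r\n");
    return strcspn(*p, " \t\r\n");
}

/* Counts the leading frames shared by two call chains.
   On return *rest_a and *rest_b point at the first frames that differ
   (or at the end of a chain if one is a prefix of the other). */
size_t common_frames(const char *a, const char *b,
                     const char **rest_a, const char **rest_b)
{
    size_t n = 0;

    assert(a != NULL && b != NULL);
    for (;;) {
        size_t la = next_frame(&a);
        size_t lb = next_frame(&b);
        if (la == 0 || la != lb || memcmp(a, b, la) != 0)
            break;                  /* end of a chain or frames differ */
        a += la;
        b += lb;
        n++;
    }
    *rest_a = a;
    *rest_b = b;
    return n;
}
```

Fed with the OUT chain and the INT21 chain shown above, it reports 15 shared frames; the first frames that differ are 786:1992 on the dongle-access path and 786:1A1C on the exit path, so the code around those addresses is where to look first.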
