EPOC Mixed-Language Programming

Abstract

Matt's notes on developing programs in C, Assembler and C++ for the EPOC platform. How to call one language from another. Synopsis of name mangling conventions.
1 Introduction

This project or information is dead. I cannot release any code or further information on it, other than what you'll find on this page. All communication regarding it will be silently deleted. Apologies.

C++ is the "official" language in which one develops software for EPOC. Symbian's OS is OO from the ground up, and C++ calls to methods in DLLs are how the OS is accessed. There are also OPL and Java implementations for EPOC, but I won't be covering them here. Although Symbian provide a C STDLIB implementation, they stress that this isn't the "C API for EPOC".

In some cases, C++ isn't sufficient. Some people claim that it is slow, and produces large programs. To some extent, this is true: virtual function lookups (used when calling member functions of objects that are not explicitly supplied, for example, when passed a pointer or reference to an object) can be expensive. This is one reason why inheritance is, IMHO, oversold. On the other hand, the fact that much of the code reused by an EPOC program is in ROM means that programs are liable to be small.

Some people simply want to code in C, or assembler - it may be what they know; not everyone loves C++; or they may need to write in assembler to make the processor jump through hoops that the rigidity of a high-level language will not allow. My own case - developing a port of the hForth system for EPOC - means that I have to code in assembler: Forth engines can be coded in high-level languages (see GNU Forth (gforth) for an example), but are possibly larger than their assembler counterparts.

There is (according to comments read on the EPOC World public newsgroups) a small amount of assembler in EPOC itself, mainly the bootstrap code. The rest of the assembler is embedded in C++ source, using the asm("...") directive. The Symbian build tool makmake does not support the building of assembler source, and so this embedded C++ method is the easiest to use.
Debugging an EPOC program is usually done by building it under the WINS environment, using Microsoft Visual C++, and testing it in the emulator - a port of EPOC that sits atop Win32. If it works there, it'll probably work on MARM - the version that runs on the Series 5/5mx/7 hardware. Of course, this is of no use whatsoever to those who write assembler! Nor is it of any comfort to those who eschew Microsoft tools and operating systems.

Development of MARM applications under alternative systems such as Linux is not covered here. Suffice it to say that it is possible, using some of the Symbian tools run under Wine, the Windows emulator. I am working on a GNU Remote Debug Stub for EPOC programs, so that they may be debugged remotely from a PC running the GNU debugger. This may help with debugging MARM software.

How, then, does one start up a program written in C or assembler for EPOC? There are two possibilities:
To do this requires knowledge of the differences between C, C++ and assembler
calling conventions, function naming, and return handling.
2 Calling language x from y

The possibilities are as follows:
3 Writing Assembler Using asm(...) Directives

This is the easiest way of writing assembler for EPOC. makmake will create a makefile that compiles a C or C++ source file, but it won't do assembler source.
The following observations were made by examining the output of gcc, when
compiling simple functions.
    extern int func();
    asm(" .text 0");
    asm(" .align 0");
    asm(" .globl func");
    asm("func:");
    asm(" mov r0, #5");
    asm(" mov pc, lr");

This simply returns the value 5.

3.2 Parameter passing

A function that takes up to four 32-bit integers as its input will receive these in the r0, r1, r2, and r3 registers, so void func(int a, int b, int c, int d) receives a in r0, b in r1, c in r2 and d in r3.

If you pass more than four inputs, you have to start using the stack - the following C, and its assembler counterpart, illustrate:

    static int j;
    int funky(int a, int b, int c, int d, int e)
    {
        j=a+b+c+d+e;
        return j*5;
    }

        .text
        .align  0
        .global funky
    funky:
        stmfd   sp!, {lr}           @ save the return address
        ldr     ip, [sp, #4]        @ fifth argument (e) comes from the stack
        ldr     lr, .L2             @ address of j
        add     r1, r0, r1          @ a+b
        add     r1, r1, r2          @ +c
        add     r1, r1, r3          @ +d
        add     r1, r1, ip          @ +e
        str     r1, [lr, #0]        @ j = sum
        add     r0, r1, r1, asl #2  @ return j + j*4, i.e. j*5
        ldmfd   sp!, {pc}           @ return
    .L3:
        .align  0
    .L2:
        .word   j
        .bss
    j:  .space  4
        .text

If you write your assembler function in a C++ source file, as a class member function, then r0 will be this, with r1, r2 and r3 holding the first three of the remaining parameters.
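One way to picture why this ends up in the first argument register is to write the member function out as a free function with the object pointer as an explicit leading parameter. The sketch below does that on a host compiler; the Accumulator class and both function names are hypothetical, invented purely for illustration, and whether the two forms compile to identical code depends on the compiler.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate the hidden 'this' argument.
struct Accumulator {
    int total;
    int add3(int a, int b, int c);
};

// Member form: the compiler passes the object pointer as an implicit
// first argument, ahead of a, b and c.
int Accumulator::add3(int a, int b, int c) {
    total += a + b + c;
    return total;
}

// Free-function form with the object pointer written out explicitly.
// Here the four arguments are self, a, b and c, in that order.
extern "C" int accumulator_add3(Accumulator* self, int a, int b, int c) {
    self->total += a + b + c;
    return self->total;
}
```

Calling x.add3(1, 2, 3) and accumulator_add3(&y, 1, 2, 3) on two objects that start with the same total leaves both with the same result.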
To test out the assembler version of a C function like this, simply write your C
as normal in a stand-alone .c file, and use the CCS.CMD script, as found in the
EPOC SDK. I have a slightly modified version of this which doesn't require the
use of the unix2dos utility, and doesn't remove the intermediate assembly
stages.
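When comparing a hand-written assembler routine against the compiler's output, it helps to have known-good results from the C version to hand. The sketch below is the five-argument funky function from above, declared as returning int since it returns a value, so that it can be built and called on a host compiler. The sample inputs mentioned afterwards are arbitrary, and none of this is part of the EPOC SDK.

```cpp
#include <cassert>

static int j;  // file-scope accumulator, as in the C original

// Reference version of funky: stores the sum of its five arguments in j
// and returns five times that sum.
int funky(int a, int b, int c, int d, int e) {
    j = a + b + c + d + e;
    return j * 5;
}
```

For example, funky(1, 2, 3, 4, 5) sums to 15 and so returns 75; an assembler version should give the same answer for the same inputs.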
    long long int func()
    {
        return 5;
    }

        .text
        .align  0
        .global func
    func:
        mov     r1, #0
        mov     r0, #5
        mov     pc, lr

In this listing the 64-bit result comes back in a register pair: r0 holds the low word (5) and r1 the high word (0).

3.4 Global and Static Variables

Variable access is performed indirectly. Space for the variable is declared in the .bss section (if the variable is not initialised) or the .data section (if the variable is initialised). If the variable is accessed, a pointer (in the .text section) is created which points to the variable's space; this is then used to access it:

    static int j=201;

        .data
        .align  0
    j:
        .word   201
        .text
        ... code ...
        ldr     r3, .L2
        str     r0, [r3, #0]
        ... code ...
    .L3:
        .align  0
    .L2:
        .word   j

For an uninitialised variable:

    static int j;

        .bss
        .align  0
    j:
        .space  4
        .text
        ... code ...
        ldr     r3, .L2
        str     r0, [r3, #0]
        ... code ...
    .L3:
        .align  0
    .L2:
        .word   j

If the variable is not declared static, a .globl variablename is used before the definition (and an indirect reference is generated, as above).

3.5 Register naming

The gas manual is unfortunately rather vague about the names it accepts for the ARM processor.
From the gas source:

    /* Processor Register Numbers */
    {"r0", 0}, {"r1", 1}, {"r2", 2}, {"r3", 3},
    {"r4", 4}, {"r5", 5}, {"r6", 6}, {"r7", 7},
    {"r8", 8}, {"r9", 9}, {"r10", 10}, {"r11", 11},
    {"r12", 12}, {"r13", 13}, {"r14", 14}, {"r15", REG_PC},
    /* APCS conventions */
    {"a1", 0}, {"a2", 1}, {"a3", 2}, {"a4", 3},
    {"v1", 4}, {"v2", 5}, {"v3", 6}, {"v4", 7}, {"v5", 8},
    {"v6", 9}, {"sb", 9}, {"v7", 10}, {"sl", 10},
    {"fp", 11}, {"ip", 12}, {"sp", 13}, {"lr", 14}, {"pc", REG_PC},
    /* FP Registers */
    {"f0", 16}, {"f1", 17}, {"f2", 18}, {"f3", 19},
    {"f4", 20}, {"f5", 21}, {"f6", 22}, {"f7", 23},
    {"c0", 32}, {"c1", 33}, {"c2", 34}, {"c3", 35},
    {"c4", 36}, {"c5", 37}, {"c6", 38}, {"c7", 39},
    {"c8", 40}, {"c9", 41}, {"c10", 42}, {"c11", 43},
    {"c12", 44}, {"c13", 45}, {"c14", 46}, {"c15", 47},
    {"cr0", 32}, {"cr1", 33}, {"cr2", 34}, {"cr3", 35},
    {"cr4", 36}, {"cr5", 37}, {"cr6", 38}, {"cr7", 39},
    {"cr8", 40}, {"cr9", 41}, {"cr10", 42}, {"cr11", 43},
    {"cr12", 44}, {"cr13", 45}, {"cr14", 46}, {"cr15", 47},

4 C++ Name Mangling

In C++, it is possible for the compiler to distinguish between function instances based on their different parameter lists (e.g. there are two definitions of func, func(int i) and func(double j), and depending on the argument given, the correct function is called). Such overloaded function definitions cause a problem for the linker, which must resolve all such references in an unambiguous manner. The C++ compiler mangles the names of functions by appending coded information about their parameters to their names.

The format of this code is not defined in the C++ specification; it is peculiar to the compiler, in this case, gcc, and if you're building software under the Emulator (WINS), then you're also relying on Visual C++'s name mangling rules. WINS is a port of EPOC, built using this compiler, and hence, Visual C++'s name mangling rules are hard-coded into the names of every function in the WINS libraries.
It is for this reason that you cannot use any other C++ compiler to generate code for the Emulator. (It is also why you can't use any other compiler to generate code for MARM, though why you'd want to use anything other than gcc is a mystery to me!)

The name mangling is transparent if you're writing wholly in C++. If you write in C or assembler, it'll bite you. The best way I've found so far to discover a function's mangled name is to run CCS.CMD on the C++ source, then look through the resultant .s file.

To defeat name mangling in a C++ program, define your functions as extern "C". I use this in wrappers, where a C program has to interface with a C++ program (e.g. the C/Asm GDB Stub code calls code in a C++ wrapper that is defined as extern "C"; this code then calls C++ code in my C++ Serial I/O module). I seem to recall trying to call C++ directly from C/Asm, calling its mangled name, but ran into linking problems. I'll re-investigate, for this document....
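The wrapper arrangement described above can be sketched roughly as follows. This is not the GDB stub or Serial I/O code itself: the SerialLog class and the serial_put_char and serial_bytes_sent functions are hypothetical stand-ins, and the sketch is written for a host compiler rather than for EPOC, where static data in a DLL may need different handling.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a C++ module; not the author's Serial I/O code.
class SerialLog {
public:
    SerialLog() : count_(0) {}
    // Store one byte; returns false once the buffer is full.
    bool put(char c) {
        if (count_ >= sizeof(data_)) return false;
        data_[count_++] = c;
        return true;
    }
    std::size_t count() const { return count_; }
private:
    char data_[64];
    std::size_t count_;
};

// The single object that the C-callable wrappers forward to.
static SerialLog& log_instance() {
    static SerialLog instance;
    return instance;
}

// extern "C" suppresses C++ name mangling for these two functions, so C or
// assembler code can refer to them by their unmangled names.
extern "C" int serial_put_char(char c) {
    return log_instance().put(c) ? 1 : 0;
}

extern "C" int serial_bytes_sent(void) {
    return static_cast<int>(log_instance().count());
}
```

The C or assembler side only ever sees the two extern "C" entry points; the class and its member functions keep their mangled names and stay on the C++ side of the boundary.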
I'll try to provide a table of common mangled names in due course; until then,
CCS.CMD is your friend.
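The overloading situation that makes mangling necessary can be reproduced in a few lines. In the sketch below, the two func overloads share one source-level name, so the compiler has to give them different symbol names for the linker; the exact spellings depend on the compiler and are deliberately not shown. The helper names call_with_int and call_with_double are made up for the example.

```cpp
#include <cassert>

// Two overloads with the same source-level name, as in the
// func(int i) / func(double j) example in the text.
int func(int)    { return 1; }  // selected for int arguments
int func(double) { return 2; }  // selected for double arguments

// Hypothetical helpers showing that the argument type picks the overload.
int call_with_int()    { return func(5); }    // calls func(int)
int call_with_double() { return func(5.0); }  // calls func(double)
```

Because both definitions end up in the same object file, they cannot both be emitted under the bare name func; a C or assembler caller therefore has to use whichever mangled name the compiler produced, or go through an extern "C" function instead.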
5 Accessing EPOC

What is the mechanism by which an assembler/C program might call EPOC? Is it via a SWI (software interrupt instruction of the ARM processor) or some other mechanism? Initial investigations show that the ARM's exception vectors are all directed to the same handler. Need a disassembler/monitor running on the ARM...

6 Misc

What effect does __declspec(naked) have on compiled source? (from the gcc/config/arm/arm.c code:)
... to be continued