Thanks Brenda, Martin and Normand;

On Thu, 2005-09-29 at 13:28 -0400, Brenda J. Butler wrote:
> On Thu, Sep 29, 2005 at 01:05:17PM -0400, Martin Hicks wrote:
> >
> > On Thu, Sep 29, 2005 at 12:24:45PM -0400, William Case wrote:
>
> > > I am trying to develop an overview understanding of what happens when
> > > parsing any little C program I have written. Where would I find the
> > > rules that the parser uses to translate specific symbols? For example,
> > > I would like to see exactly what happens with the "{ }" braces or the
> > > ";" semi-colon when translated from source code to object code or
> > > binary. But my question is not only about those two symbols alone. I
>
> Well, the gcc is the Gnu Compiler Collection. It has front ends for reading
> and parsing programs (a front end for each language), back ends for spitting
> out machine code (a back end for each architecture/os), and a middle section
> that connects the front end and back end and does optimization.
>
> The middle section deals with the code in a language- and machine-independent
> format.
>
> So, you won't see the gcc compiler converting a { into machine or
> binary code.
>
> Have fun with that :-)
>
> ...
>
> > Wow. So this isn't just an easy question that someone can answer. What
> > happens during parsing of a source language? Lots. Far too much to try
> > to explain, but here's a little summary of stuff you should go look
> > into.
> > If you *really* want to know more about how compilers work you should:
> Read the dragon book

I have heard of the dragon book and I hope to peruse it in the near future.

> Read o'reilly's Lex and Yacc

More than I need to know I think.

> Take an undergrad compilers course at university

Me and universities will never mix.

> Read the gcc sources and be prepared to welcome death

I am ready to accept death soon. I have no intention of ever building a compiler, although the logic of using things like BNF or EBNF seems intriguing.
In my reading, for example, I saw the rule of reducing /* */ comments to one empty space. I was hoping that somewhere in the literature, or easily accessible through source code, I could find other rules. Brenda has supplied me with a verbal answer about the use of braces and semi-colons and I don't distrust any advice that she would give me. However, as I was reading about compiling I thought "Just maybe there is an easy way to see for myself what is happening internally when the compiler translates." It wouldn't have been the first time that I had discovered that something I thought was hard to follow in Linux turned out to be easy. Alas, apparently not this time.

> > > I have a couple of other short dumb questions about compiling. If
> > > someone is willing to answer them they can email me directly and I will
> > > send them to you? I just think they might be boring for the people on
> > > this list, and embarrassingly stupid for me. To be honest, most of
> > > these questions I could probably sort out for myself but I have already
> > > spent more than a day on getting a compiling overview and a little help
> > > would be gratefully received.
> >
> > Ask the list. If dumb questions get answered once then maybe, just
> > maybe, the next person who wants to ask the same dumb question will
> > search the archives first.

I am going to take you at your word and ask the following questions about compiling. OK, here goes!

In my info index I have at least the following files related to compiling: make, gcc, flex, yacc, bison, cpp, ld, as, NASM, gdb, DDD and I am sure many others I haven't recognized. I have read at least the introduction and overview of all these files, and much more for some of them. The quoted text below consists of explanations I have saved concerning the use of these programs, either from Info, RedHat manuals or Wikipedia.

My basic dumb question is: What are they all for? [That's a rhetorical question.]
To flesh out the question more:

It seems that 'make', through a makefile, runs as a script that groups all the files that have to be compiled or re-compiled together so that they can be compiled or linked in one directory. I know it does more but that appears to be the core idea. Somewhere in that process the makefile must actually compile the various files, yet when I looked in the default makefile I could not see a command that would start compilation. Does the default makefile use gcc or another, perhaps built-in, compiler?

Now, suppose I got the compiler going (I have been using gcc -g foo.c -o foo). I then have the following questions:

Quoting from 'info gcc':

"Compilation can involve up to four stages, always in the following order:
 * preprocessing
 * compiling
 * assembling
 * linking
The first three stages apply to an individual source file:"

"preprocessing establishes the type of source code to process"

"preprocessor
A program invoked by various compilers to process code before compilation. For example, the C preprocessor, cpp, handles textual macro substitution, conditional compilation and inclusion of other files."

I suppose from the above that preprocessing finds and/or adds the files named in #include, or replaces each named constant with the value given in its #define? I am assuming that any # statements are what are called macros and that the preprocessor takes care of macros? Does the preprocessor perform any other core functions? I have a list of options or commands. When would I use them? Or should I just forget about them for the time being? Is there a specific call in gcc to cpp to perform the preprocessing, or does it have its own built-in preprocessor? If the gcc preprocessor is built-in, when would I use cpp?

"compiling produces an object file,"

Brenda says: "Well, the gcc is the Gnu Compiler Collection.
It has front ends for reading and parsing programs (a front end for each language), back ends for spitting out machine code (a back end for each architecture/os), and a middle section that connects the front end and back end and does optimization."

After preprocessing, does the gcc front end call flex and then yacc? In a few words, what is optimization -- all the blanks are gone; all the syntax has already been rearranged? Then what does it call? Is there a separate program that does the spitting out of machine code? Or is that part of the gcc coding? It sounds like the preprocessor, assembler and linker do everything?

"assembling establishes the syntax that the compiler expects for symbols, constants, expressions and the general directives."

This sounds like the lexicon and syntax stage, but apparently that has already been done by the compiler? It sounds like the preprocessor, compiler and linker do everything, so why do I need the assembler? If I print the file after the assembler stage, I can see my source code changed into assembler code, can't I? Shouldn't I see assembler code after the compiler is finished?

"The last stage, linking, completes the compilation process, combining all object files (newly compiled, and those specified as input) into an executable file."

Is this where the code (after being twisted into a meaningful lexicon and syntax, or assembler code) finally gets turned into binary? How does the machine's instruction set get used then?

"If you only want some of the stages of compilation, you can use -x (or filename suffixes) to tell gcc where to start, and one of the options -c, -S, or -E to say where gcc is to stop."

How would you do that? Wouldn't you have to name the file+suffix?

"ld combines a number of object and archive files, relocates their data and ties up symbol references. Usually the last step in compiling a program is to run ld."

Does gcc call ld or does it have its own linker?
Is this when the functions referenced through the header #include get added to the compiled program? Is this just a reference, or is the actual binary for the function added into the compiled file? Isn't there something about dynamically linked files (dll) that applies here? What are they in relation to a compiled program?

> I agree. We don't usually find your questions dumb anyway, often
> they seem simple but spark some interesting discussions.
>
> So if you finally do ask a dumb one, you've earned the right :-)
>
> cheerio,
> bjb

Lastly, why would a distribution like Fedora Core 4 install all this compiling paraphernalia? It's not a facetious question. Is there some use for extra preprocessors, assemblers, etc. that I am unaware of?

Regards
Bill

P.S. For Brenda.

Brenda, I appreciate your comforting me and my small ego with the reassurances that you have given me on several occasions that my questions are not dumb or stupid. If you can keep a secret I will make a confession. My claiming that a question is probably a stupid one is an old habit. I used to work professionally analyzing provincial government programs and/or businesses in order to write communications plans or business plans. When it came to asking the core questions I got in the habit of prefacing those questions with "This may be a dumb question." It was partly a joke or tease. If the question was genuinely core, the respondent usually got either very quiet or began to talk really fast. If it was really a stupid question and showed that I had missed the mark, then the joke was on me.

Friends who know me usually answer one of three ways: 1) By responding thoughtfully, 2) By listening carefully and then pointing out that I have mis-posed the question and should have asked an alternative one, or, 3) By saying something like "Yeah, you're right, that is a really stupid question."
My wife, if she were here, would reassure you that protecting Bill's ego is probably pretty low on the list of things that need to be done in this world. Thanks for all your help.