----- Forwarded message from "Brenda J. Butler" <bjb> -----

Date: Thu, 29 Sep 2005 23:25:34 -0400
From: "Brenda J. Butler" <bjb>
Subject: Re: [OCLUG-Tech] Re: Parsing when compiling C - generalized understanding question?
To: William Case <billlinux [ at ] rogers [ dot ] com>

On Thu, Sep 29, 2005 at 08:22:12PM -0400, William Case wrote:
> It seems that 'make' through makefile runs as a script that groups all
> the files that have to be compiled or re-compiled together so that they
> can be compiled or linked in one directory.  I know it does more but
> that appears to be the core idea.

Close.  Make is like a souped-up shell script that compiles all
your files for you.

The extra feature that make offers, besides collecting a bunch of
commands in one place, is:

  dependencies:  Through the makefile, you can tell make which
  executables or libs are created from which sources.  So when a
  change is made to certain sources, only the affected objects,
  executables and libs are re-generated.

> Somewhere in that process makefile must actually compile the various
> files, yet when I looked in the default makefile I could not see a
> command that would start compilation.  Does the default makefile use
> gcc, or, another perhaps built-in compiler?

Makefiles are, as you say, the config file for the make program.
However, make has a lot of "built-in" rules and dependencies.  It
knows it has to run the C compiler to change a .c file into a .o
file (and it has a lot of other rules built-in).  You can get make to
dump out all its default rules with an option:

    make -p -f/dev/null

Makefiles have various types of entries.  Two of the most common
are macro definitions and rules.  A macro definition looks like:

    CFLAGS = -g -Wall

and a rule looks like:

    target: objs
    	$(CC) $(CFLAGS) objs -o target

The line (or lines) in the rule indented by tabs are essentially
a little shell script.  Those lines really get executed by a shell.
In this case, you can see the C compiler is being called to compile
or link objs into target.

In some makefiles, the author relies entirely on built-in rules and
you might not see such rules unless you dump out the database.

It gets a lot more complicated, but that's a start for you.

> Now, suppose I got the compiler going (I have been using gcc -g foo.c -o
> foo).  I then have the following questions:
>
> Quoting from 'info gcc'.
>
> "Compilation can involve up to four stages, always in the following
> order:
>
> *preprocessing
> *compiling
> *assembling
> *linking
>
> The first three stages apply to an individual source file:"
>
> "preprocessing establishes the type of source code to process"
>
> "preprocessor A program invoked by various compilers to process
> code before compilation.  For example, the C preprocessor, cpp,
> handles textual macro substitution, conditional compilation and
> inclusion of other files."
>
> I suppose from the above, preprocessing finds and/or adds the files in
> #include or replaces all the constants with the name with the value
> given in #define value?
>
> I am assuming that any #statements are what are called macros and the
> preprocessor takes care of macros?
>
> Does preprocessor perform any other core functions?  I have a list of
> options or commands.  When would I use them?  Or, should I just forget
> about them for the time being?

Preprocessors do text substitution, inclusion of other files, and
conditional inclusion/exclusion.  This gets done before the compiler
sees it.  Yes, the # statements are preprocessor directives, of which
macros are one kind.

> Is there a specific call in gcc to cpp to perform the preprocessing, or,
> does it have its own built-in preprocessor?

gcc calls the preprocessor.  You can ask gcc to stop after preprocessing
and dump out the results for you to inspect:

    gcc -E input -o output

> If the gcc preprocessor is built-in when would I use cpp?
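
A minimal sketch of the three preprocessor jobs named above (text
substitution, inclusion of other files, conditional inclusion/exclusion).
The names BUFFER_SIZE, SQUARE, VERBOSE, LOG and scaled_size are made up
for illustration and do not come from the thread.

```c
#include <assert.h>  /* file inclusion: the header's text is pasted in here */
#include <stdio.h>

/* Object-like macro: each later BUFFER_SIZE is replaced by the text 64. */
#define BUFFER_SIZE 64

/* Function-like macro: the argument is substituted as text, which is why
   every use of x is wrapped in parentheses. */
#define SQUARE(x) ((x) * (x))

/* Conditional compilation: only one of these two definitions survives
   preprocessing, depending on whether VERBOSE is defined. */
#ifdef VERBOSE
#define LOG(msg) printf("log: %s\n", msg)
#else
#define LOG(msg) ((void)0)
#endif

/* Hypothetical function, here only to give the macros something to expand in. */
int scaled_size(int n)
{
    assert(n >= 0);  /* assert is itself a macro; defining NDEBUG removes it */
    LOG("computing scaled size");
    return SQUARE(n + 1) + BUFFER_SIZE;
}
```

Saved as demo.c (a hypothetical file name), running
gcc -E demo.c -o demo.i should show, after a long stretch of text pulled
in from the headers, the return statement with SQUARE(n + 1) expanded to
((n + 1) * (n + 1)) and BUFFER_SIZE replaced by 64.  Adding -DVERBOSE to
the command line switches LOG to the printf version.
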
>
> "compiling produces an object file,"
>
> Brenda says:
>
> "Well, the gcc is the GNU Compiler Collection.  It has front ends for
> reading and parsing programs (a front end for each language), back ends
> for spitting out machine code (a back end for each architecture/os), and
> a middle section that connects the front end and back end and does
> optimization."
>
> After preprocessing does gcc front end call flex and then yacc?

gcc calls the preprocessor.  Then gcc compiles.  The compilation step
has those three components (front-end, middle, back-end).  Then
gcc calls the linker.

> In a few words, what is optimization -- all the blanks are gone; all the
> syntax has already been rearranged?

Bart could probably do a much better job answering this one, but
anyway: things that might get done in the optimization step are
getting rid of intermediate variables if you can make do with
fewer, re-arranging the code to be more efficient, etc.  Removal
of dead code and unused variables.  Moving code into or out of
loops.  That kind of thing.

> Then does it call what??  Is there a separate program that does spitting
> out of machine code?  Or, is that part of the gcc coding?

The compile stage is all one program.  However, a Fortran compiler
might share a middle section and backend with a C compiler.  How this
is implemented, I'm not sure.  Does it use shared objects (equivalent
of dll as you mention above) or does each compiler just get built from
a shared set of sources?  Something like that.

> It sounds like the preprocessor, assembler and linker do everything?

For regular programs, written in a programming language like Fortran
or C, yes.

> "assembling establishes the syntax that the compiler expects for
> symbols, constants, expressions and the general directives."
>
> This sounds like the lexicon and syntax stage, but apparently that has
> been already done by the compiler?
>
> It sounds like the preprocessor, compiler and linker do everything, why
> do I need the assembler?
Sometimes compiler backends produce assembly language rather than
actual machine code.  Then the last step is to assemble the assembly
language into machine code.

> If I print the file after the assembler stage, I can see my source code
> changed into assembler code, can't I?  Shouldn't I see assembler code
> after the compiler is finished?
>

You can get gcc to stop after that stage too:

    gcc -S input.c -o input.s

You will see assembly language code (which is dressed-up machine code).

> "The last stage, linking, completes the compilation
> process, combining all object files (newly compiled, and those
> specified as input) into an executable file."
>
> Is this where the code (after being twisted into a meaningful lexicon
> and syntax, or assembler code) finally gets turned into binary?

No, it gets changed to binary in the compile stage, of which assembly
is the last part (it's the spit-out-machine-code part).

> How does the machines instruction set get used then?
>
> "If you only want some of the stages of compilation, you can use -x (or
> filename suffixes) to tell gcc where to start, and one of the options
> -c, -S, or -E to say where gcc is to stop."
>
> How would you do that?  Wouldn't you have to name the file+suffix?

See above...

> "ld combines a number of object and archive files, relocates their data
> and ties up symbol references.  Usually the last step in compiling a
> program is to run ld."
>
> Does gcc call ld or does it have its own linker?

gcc calls ld for you.

> Is this when the functions referenced through the header #include get
> added to the compiled program?

#include is a preprocessor step; it happens before compilation.

> Is this just a reference or is the actual binary for the function added
> into the compiled file?

Hey!  We're way past two questions here!

> Isn't there something about dynamically linked files (dll) that applies
> here?  What are they in relation to a compiled program?
Dynamically linked files are libraries that get linked at run time
instead of at compile/link time.

> Lastly, why would a distribution like Fedora Core 4 install all this
> compiling paraphernalia?  Its not a vicesious question.  Is there some
> use for extra preprocessors, assemblers, etc. that I am unaware of?

The software developers who make the distro leave it in.

"facetious"

> My claiming that a question is probably a stupid one is an old habit.  I
> used to work professionally analyzing provincial government programs
> and/or businesses in order to write communications plans or business
> plans.  When it came to asking the core questions I got in the habit of
> prefacing those questions with "This may be a dumb question."  It was
> partly a joke or tease.  If the question was genuinely core, the
> respondent usually got either very quiet or began to talk really fast.
> If it was really a stupid question and showed that I had missed the
> mark, then the joke was on me.
...
> My wife, if she were here, would reassure you that protecting Bill's ego
> is probably pretty low on the list of things that need to be done in
> this world.

Ok, but don't threaten to ask privately again ! :-)

cheerio,
bjb

----- End forwarded message -----
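
A small sketch for the #include and linking questions in the message
above, under stated assumptions: say_hello is a hypothetical name, and
puts is declared by hand here purely to show what a header contributes.
Real code would simply #include <stdio.h>.

```c
/* What #include <stdio.h> would paste in for puts: only a declaration.
   The machine code for puts lives in the C library, not in this file. */
int puts(const char *s);

/* Hypothetical function for illustration.  The compiler emits a call to
   the symbol "puts" and leaves its address for the linker to fill in. */
int say_hello(void)
{
    return puts("hello from a single source file");
}
```

Saved as hello.c (again a hypothetical name), gcc -c hello.c -o hello.o
stops before linking, and nm hello.o would typically list puts as
undefined (U).  That reference is what ld ties up against the C library,
which on most Linux systems is a shared library loaded at run time.
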