----- Forwarded message from "Brenda J. Butler" <bjb> -----
Date: Thu, 29 Sep 2005 23:25:34 -0400
From: "Brenda J. Butler" <bjb>
Subject: Re: [OCLUG-Tech] Re: Parsing when compiling C - generalized understanding question?
To: William Case <billlinux [ at ] rogers [ dot ] com>
On Thu, Sep 29, 2005 at 08:22:12PM -0400, William Case wrote:
> It seems that 'make' through makefile runs as a script that groups all
> the files that have to be compiled or re-compiled together so that they
> can be compiled or linked in one directory. I know it does more but
> that appears to be the core idea.
Close. Make is like a souped-up shell script that compiles
all your files for you.
The extra feature that make offers, besides collecting a bunch
of commands in one place, is dependency tracking. Through the
makefile, you tell make which executables or libs are created
from which sources. Then when certain sources change, only the
affected objects, executables and libs are re-generated.
> Somewhere in that process makefile must actually compile the various
> files, yet when I looked in the default makefile I could not see a
> command that would start compilation. Does the default makefile use
> gcc, or, another perhaps built-in compiler?
Makefiles are, as you say, the config file for the make
program. However, make has a lot of "built-in" rules
and dependencies. It knows it has to run the C compiler
to change a .c file into a .o file (and it has a lot
of other rules built in). You can get make to dump out
all its default rules with:
    make -p -f /dev/null
Makefiles have various types of entries. Two of the most
common are macro definitions and rules. A macro definition
looks like:
    CFLAGS = -g -Wall
and a rule looks like:
    target: objs
    	$(CC) $(CFLAGS) objs -o target
The line (or lines) in the rule indented by a tab are essentially
a little shell script. Those lines really get executed by a
shell. In this case, you can see the C compiler is being called
to compile or link objs into target.
In some makefiles, the author relies entirely on built-in rules
and you might not see such rules unless you dump out the database.
It gets a lot more complicated, but that's a start for you.
> Now, suppose I got the compiler going (I have been using gcc -g foo.c -o
> foo). I then have the following questions:
>
> Quoting from 'info gcc'.
>
> "Compilation can involve up to four stages, always in the following
> order:
>
> *preprocessing
> *compiling
> *assembling
> *linking
>
> The first three stages apply to an individual source file:"
>
> "preprocessing establishes the type of source code to process"
>
> "preprocessor A program invoked by various compilers to process
> code before compilation. For example, the C preprocessor, cpp,
> handles textual macro substitution, conditional compilation and
> inclusion of other files."
>
> I suppose from the above, preprocessing finds and/or adds the files in
> #include or replaces all the constants with the name with the value
> given in #define value?
>
> I am assuming that any #statements are what are called macros and the
> preprocessor takes care of macros?
>
> Does preprocessor perform any other core functions? I have a list of
> options or commands. When would I use them? Or, should I just forget
> about them for the time being?
Preprocessors do text substitution, inclusion of other files, and
conditional inclusion/exclusion. This gets done before the compiler
sees it.
Yes, the # statements are preprocessor directives. Macros are
what one of those directives, #define, sets up.
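
To make the three jobs concrete, here is a small sketch in C. All the names in it (BUFFER_SIZE, SQUARE, LOG, VERBOSE, scaled) are made up for illustration; nothing here comes from a real project.

```c
#include <stdio.h>   /* inclusion: the text of stdio.h is pasted in here */

/* Text substitution. These names are hypothetical, for illustration only. */
#define BUFFER_SIZE 64            /* object-like macro */
#define SQUARE(x) ((x) * (x))     /* function-like macro, takes an argument */

/* Conditional inclusion: only one of these two definitions of LOG
 * survives, depending on whether VERBOSE was defined earlier. */
#ifdef VERBOSE
#define LOG(msg) puts(msg)
#else
#define LOG(msg) ((void)0)
#endif

/* After preprocessing, the return line reads: return ((n) * (n)) + 64; */
int scaled(int n)
{
    LOG("scaled() called");
    return SQUARE(n) + BUFFER_SIZE;
}
```

By the time the compiler proper sees this, the # lines are gone and every use of SQUARE, BUFFER_SIZE and LOG has been replaced by its text. Calling scaled(3) gives 9 + 64.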
> Is there a specific call in gcc to cpp to perform the preprocessing, or,
> does it have its own built-in preprocessor?
gcc calls the preprocessor. You can ask gcc to stop after
preprocessing and dump out the results for you to inspect:
    gcc -E input -o output
> If the gcc preprocessor is built-in when would I use cpp?
>
> "compiling produces an object file,"
>
> Brenda says:
>
> "Well, the gcc is the Gnu Compiler Collection. It has front ends for
> reading and parsing programs (a front end for each language), back ends
> for spitting out machine code (a back end for each architecture/os), and
> a middle section that connects the front end and back end and does
> optimization."
>
> After preprocessing does gcc front end call flex and then yacc?
gcc calls the preprocessor.
Then gcc compiles.
The compilation step has those three components (front-end,
middle, back-end).
Then gcc calls the linker.
> In a few words, what is optimization -- all the blanks are gone; all the
> syntax has already been rearranged?
Bart could probably do a much better job answering this one,
but anyway: things that might get done in the optimization step
are getting rid of intermediate variables if you can make do
with fewer, re-arranging the code to be more efficient,
removing dead code and unused variables, and moving
code into or out of loops. That kind of thing.
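
As a rough picture of those transformations, here is the same function twice: once as a person might write it, and once rewritten by hand the way an optimizer is free to treat it. This is only a sketch with made-up names (sum_scaled, sum_scaled_by_hand); what a given compiler actually does depends on the compiler and its options.

```c
/* As written: a product recomputed on every pass through the loop,
 * an extra temporary, and a branch that can never run. */
int sum_scaled(const int *values, int count, int a, int b)
{
    int total = 0;
    for (int i = 0; i < count; i++) {
        int factor = a * b;              /* same value every iteration */
        int term = values[i] * factor;   /* intermediate variable */
        total += term;
    }
    if (0) {
        total = -1;                      /* dead code */
    }
    return total;
}

/* Hand-rewritten equivalent: same result, less work. */
int sum_scaled_by_hand(const int *values, int count, int a, int b)
{
    int factor = a * b;                  /* moved out of the loop */
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += values[i] * factor;     /* temporary folded away */
    }
    return total;                        /* dead branch removed */
}
```

Both versions return the same value for the same inputs, which is the point: optimization changes how the work is done, not what the program computes.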
> Then does it call what?? Is there a separate program that does spitting
> out of machine code? Or, is that part of the gcc coding?
The compile stage is all one program.
However, a Fortran compiler might share a middle section and backend
with a C compiler. How this is implemented, I'm not sure: does
it use shared objects (the equivalent of the DLLs you mention) or
does each compiler just get built from a shared set of sources?
Something like that.
> It sounds like the preprocessor, assembler and linker do everything?
For regular programs, written in a programming language
like Fortran or C, yes.
> "assembling establishes the syntax that the compiler expects for
> symbols, constants, expressions and the general directives."
>
> This sounds like the lexicon and syntax stage, but apparently that has
> been already done by the compiler?
>
> It sounds like the preprocessor, compiler and linker do everything, why
> do I need the assembler?
Sometimes compiler backends produce assembly language rather
than actual machine code. Then the last step is to assemble
the assembly language into machine code.
> If I print the file after the assember stage, I can see my source code
> changed into assembler code, can't I? Shouldn't I see assembler code
> after the compiler is finished?
>
You can get gcc to stop after compiling, before the assembler runs:
    gcc -S input.c -o input.s
You will see assembly language code (which is dressed-up
machine code).
> "The last stage, linking, completes the compilation
> process, combining all object files (newly compiled, and those
> specified as input) into an executable file."
>
> Is this where the code (after being twisted into a meaningful lexicon
> and syntax, or assembler code) finally gets turned into binary?
No, it gets changed to binary in the compile stage, of which
assembly is the last part (it's the spit-out-machine-code part).
> How does the machines instruction set get used then?
>
> "If you only want some of the stages of compilation, you can use -x (or
> filename suffixes) to tell gcc where to start, and one of the options
> -c, -S, or -E to say where gcc is to stop."
>
> How would you do that? Wouldn't you have to name the file+suffix?
See above: the -E and -S examples show where to stop, and gcc
works out where to start from the file suffix.
> "ld combines a number of object and archive files, relocates their data
> and ties up symbol references. Usually the last step in compiling a
> program is to run ld."
>
> Does gcc call ld or does it have its own linker?
gcc calls ld for you.
> Is this when the functions referenced through the header #include get
> added to the compiled program?
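#include is a preprocessor step; it happens before compilation.
It only pulls in the text of the header (declarations, mostly).
The code for those functions gets hooked up at link time.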
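
Here is a small sketch of the difference between what a header gives you and what the linker gives you. The names add_one and call_add_one are made up, and everything is squeezed into one file only so that it builds on its own.

```c
/* What a header typically contributes once #include has pasted it in:
 * a declaration. It gives the compiler the function's name and types,
 * but no code. */
int add_one(int n);

/* The compiler can translate this call knowing only the declaration. */
int call_add_one(int n)
{
    return add_one(n);
}

/* The definition: the actual code. In a real program this would usually
 * live in another .c file or in a library. In that case the object file
 * for the caller only records that it needs add_one, and the linker ties
 * the call to the code. */
int add_one(int n)
{
    return n + 1;
}
```

So the #include line by itself never adds a function's binary to your program. It only lets the compiler check and translate the call.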
> Is this just a reference or is the actual binary for the function added
> into the compiled file?
Hey! We're way past two questions here!
> Isn't there something about dynamically linked files (dll) that applies
> here? What are they in relation to a compiled program?
Dynamically linked files are libraries that get linked at
run time instead of at compile/link time.
> Lastly, why would a distribution like Fedora Core 4 install all this
> compiling paraphernalia? Its not a vicesious question. Is there some
> use for extra preprocessors, assemblers, etc. that I am unaware of?
The software developers who make the distro leave it in.
"facetious"
> My claiming that a question is probably a stupid one is an old habit. I
> used to work professionally analyzing provincial government programs
> and/or businesses in order to write communications plans or business
> plans. When it came to asking the core questions I got in the habit of
> prefacing those questions with "This may be a dumb question." It was
> partly a joke or tease. If the question was genuinely core, the
> respondent usually got either very quiet or began to talk really fast.
> If it was really a stupid question and showed that I had missed the
> mark, then the joke was on me.
...
> My wife, if she were here, would reassure you that protecting Bill's ego
> is probably pretty low on the list of things that need to be done in
> this world.
Ok, but don't threaten to ask privately again! :-)
cheerio,
bjb
----- End forwarded message -----