Knowledge in Computer Science

RADIAL BASIS FUNCTION NETWORKS

Compilation tool chain - part 1

For large programs, the compiler is actually part of a multistep tool chain:

    [preprocessor] → [compiler] → [assembler] → [linker] → [loader]

We will be primarily focused on the second element of the chain, the compiler; our target language will be assembly language. I give a very short description of the other components, including some historical comments.

Preprocessors

Preprocessors are normally fairly simple, as in the C language, providing primarily the ability to include files and expand macros. There are exceptions, however. IBM's PL/I, another Algol-like language, had quite an extensive preprocessor, which made available at preprocessor time much of the PL/I language itself (e.g., loops and, I believe, procedure calls).

Some preprocessors essentially augment the base language to add additional capabilities. One could consider them compilers in their own right, having as source the augmented language (say, Fortran augmented with statements for multiprocessor execution in the guise of Fortran comments) and as target the original base language (in this case, Fortran). Often the "preprocessor" inserts procedure calls to implement the extensions at runtime.

Assemblers

Assembly code is a mnemonic version of machine code, in which names, rather than binary values, are used for machine instructions and memory addresses.

Some processors have fairly regular operations, and as a result assembly code for them can be fairly natural and not too hard to understand. Other processors, in particular Intel's x86 line, have, let us say charitably, more interesting instructions, with certain registers used for certain things.

My laptop has one of these latter processors (a Pentium 4), so my gcc compiler produces code that, from a pedagogical viewpoint, is less than ideal. If you have a Mac with a PPC processor (the newest Macs are x86), your assembly language is cleaner. NYU's ACF features Sun computers with SPARC processors, which also have regular instruction sets.
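To make the preprocessor's main job concrete, here is a minimal sketch in Python of C-style macro expansion by textual substitution. The macro table and input line are invented for illustration; a real preprocessor also handles file inclusion, function-like macros, and conditional compilation, and guards against self-referential macros (which would loop forever in this sketch).

    import re

    # A hypothetical macro table, as might be built from lines like
    # "#define BUFSIZE 1024" and "#define MAXLEN BUFSIZE".
    macros = {"BUFSIZE": "1024", "MAXLEN": "BUFSIZE"}

    def expand(line, macros):
        # Substitute whole identifiers only, repeating until nothing changes,
        # since one macro's body may mention another macro.
        prev = None
        while line != prev:
            prev = line
            for name, body in macros.items():
                line = re.sub(r"\b%s\b" % re.escape(name), body, line)
        return line

    print(expand("char buf[MAXLEN];", macros))   # char buf[1024];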

Compilation tool chain - part 2

Two-Pass Assembly

No matter what the assembly language is, an assembler needs to assign memory locations to symbols (called identifiers) and use the numeric location address in the target machine language produced. Of course, the same address must be used for all occurrences of a given identifier, and two different identifiers must (normally) be assigned two different locations.

The conceptually simplest way to accomplish this is to make two passes over the input (read it once, then read it again from the beginning). During the first pass, each time a new identifier is encountered, an address is assigned and the pair (identifier, address) is stored in a symbol table. During the second pass, whenever an identifier is encountered, its address is looked up in the symbol table and this value is used in the generated machine instruction. (A small sketch of this appears at the end of this part.)

Linkers

Linkers, a.k.a. linkage editors, combine the output of the assembler for several different compilations. That is, the horizontal line of the diagram above should really be a collection of lines converging on the linker. The linker has another input, namely libraries, but to the linker the libraries look like other compiled and assembled programs. The two primary tasks of the linker are
- relocating relative addresses, and
- resolving external references (such as the procedure xor() above).
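Returning to two-pass assembly, here is a minimal sketch in Python for a made-up assembly syntax in which a line ending in a colon defines a label and every other line is a one-word instruction. The syntax and program are invented for illustration; real assemblers must also handle directives, expressions, and multi-word instructions.

    # A made-up one-word-per-instruction assembly; "name:" defines a label.
    source = [
        "loop:", "load x", "jmp loop",
        "x:",    "word 0",
    ]

    # Pass 1: assign the current address to each label as it is met.
    symtab, addr = {}, 0
    for line in source:
        if line.endswith(":"):
            symtab[line[:-1]] = addr    # store the (identifier, address) pair
        else:
            addr += 1                   # each instruction occupies one word

    # Pass 2: emit code, looking each identifier up in the symbol table.
    code = []
    for line in source:
        if not line.endswith(":"):
            op, *args = line.split()
            code.append((op, [symtab.get(a, a) for a in args]))

    print(symtab)   # {'loop': 0, 'x': 2}
    print(code)     # [('load', [2]), ('jmp', [0]), ('word', ['0'])]

One-pass assemblers exist as well; instead of a second pass, they record each forward reference and back-patch it once the symbol's address becomes known.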

Compilation tool chain - part 3

Relocating Relative Addresses

The assembler processes one file at a time. Thus the symbol table produced while processing file A is independent of the symbols defined in file B, and conversely. Thus it is likely that the same address will be used for different symbols in each program. The technical term is that the (local) addresses in the symbol table for file A are relative to file A; they must be relocated by the linker. This is accomplished by adding the starting address of file A (which in turn is the sum of the lengths of all the files processed previously in this run) to the relative address.

Resolving External References

Assume procedure f, in file A, and procedure g, in file B, are compiled (and assembled) separately. Assume also that f invokes g. Since the compiler and assembler do not see g when processing f, it appears impossible for procedure f to know where in memory to find g.

The solution is for the compiler to indicate, in the output of the file A compilation, that the address of g is needed. This is called a use of g. When processing file B, the compiler outputs the (relative) address of g. This is called the definition of g. The assembler passes this information to the linker.

The simplest linker technique is to again make two passes. During the first pass, the linker records in its "external symbol table" (a table of external symbols, not a symbol table that is stored externally) all the definitions encountered. During the second pass, every use can be resolved by access to the table.
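Here is a minimal sketch in Python of both tasks together for two hypothetical object files: relocation adds each file's starting address to its local definitions, and a two-pass linker then resolves every use of g through the external symbol table. The file layout and field names are invented for illustration.

    # Two hypothetical "object files": each records its length in words,
    # its local symbol definitions (relative addresses), and its uses of
    # external symbols (the offsets where the address must be patched).
    file_a = {"len": 100, "defs": {"f": 0},  "uses": {"g": [40]}}   # f calls g
    file_b = {"len": 80,  "defs": {"g": 10}, "uses": {}}
    files = [file_a, file_b]

    # Pass 1: lay the files out end to end; relocate each definition by its
    # file's starting address and record it in the external symbol table.
    external, base = {}, 0
    for f in files:
        f["base"] = base
        for sym, rel in f["defs"].items():
            external[sym] = base + rel
        base += f["len"]

    # Pass 2: resolve every use by looking it up in the table.
    for f in files:
        for sym, offsets in f["uses"].items():
            for off in offsets:
                print("patch word %d to %d (address of %s)"
                      % (f["base"] + off, external[sym], sym))

    # external == {'f': 0, 'g': 110}; the use of g at absolute word 40
    # is patched to 110.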

Compilation tool chain - part 4

Loaders

After the linker has done its work, the resulting "executable file" can be loaded by the operating system into central memory. The details are OS dependent. With early single-user operating systems, all programs were loaded at a fixed address (say 0), and the loader simply copied the file to memory. Today it is much more complicated, since (parts of) many programs reside in memory at the same time. Hence the compiler/assembler/linker cannot know the real location for an identifier. Indeed, this real location can change.

Structure of a Compiler

Modern compilers contain two (large) parts, each of which is often subdivided: the front end and the back end.

The front end analyzes the source program, determines its constituent parts, and constructs an intermediate representation of the program. Typically the front end is independent of the target language.

The back end synthesizes the target program from the intermediate representation produced by the front end. Typically the back end is independent of the source language.

This front/back division greatly reduces the work for a compiling system that can handle several (N) source languages and several (M) target languages: instead of N×M complete compilers, we need only N front ends and M back ends. For gcc (originally standing for Gnu C Compiler, but now standing for Gnu Compiler Collection), N=7 and M≈30, so the savings are considerable: roughly 37 components rather than about 210 compilers.

Multiple Phases

The front and back ends are themselves each divided into multiple phases, with the input to each phase being the output of the previous one. Sometimes a phase changes the representation of the input. For example, the lexical analyzer converts a character-stream input into a token-stream output. Sometimes the representation is unchanged. For example, the machine-dependent optimizer transforms target-machine code into (hopefully improved) target-machine code.

Diagrams of this structure are definitely not drawn to scale, in terms of effort or lines of code; in practice the optimizers, especially the machine-dependent one, dominate.

Conceptually, there are three phases of analysis, with the output of one phase the input of the next. The phases are called lexical analysis or scanning, syntax analysis or parsing, and semantic analysis.
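As a concrete example of a representation-changing phase, here is a minimal sketch in Python of a lexical analyzer that converts a character stream such as 7+4-5 into a token stream; the token names are invented for illustration.

    def scan(chars):
        # Convert a character stream into a list of (token, lexeme) pairs.
        tokens = []
        for ch in chars:
            if ch.isdigit():
                tokens.append(("DIGIT", ch))
            elif ch in "+-":
                tokens.append(("OP", ch))
            elif not ch.isspace():
                raise ValueError("unexpected character: %r" % ch)
        return tokens

    print(scan("7+4-5"))
    # [('DIGIT', '7'), ('OP', '+'), ('DIGIT', '4'), ('OP', '-'), ('DIGIT', '5')]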

Software Productivity Tools

Dataflow techniques developed for optimizing code are also useful for finding errors. Here correctness is not an absolute requirement; that is a good thing, since finding all errors is undecidable.

Type Checking

Techniques developed to check for type correctness (we will see some of these) can be extended to find other errors, such as the use of an uninitialized variable (a tiny sketch appears at the end of this part).

Bounds Checking

As mentioned above, optimizations have been developed to eliminate unnecessary bounds checking for languages like Ada and Java that perform the checks automatically. Similar techniques can help find potential buffer-overflow errors, which can be a serious security threat.

Memory-Management Tools

Languages with garbage collection (e.g., Java) cannot have memory leaks (failure to free no-longer-accessible memory). Compilation techniques can help find such leaks in languages like C that do not have garbage collection.
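As a tiny illustration of the uninitialized-variable check mentioned above, here is a minimal sketch in Python that flags a possible use-before-assignment in a straight-line program. The statement encoding is invented for illustration; a real dataflow analysis must also handle branches, loops, and procedure calls.

    # Each statement is modeled as (assigned_variable, variables_read);
    # this little program is invented for illustration.
    program = [
        ("x", set()),        # x = 1
        ("y", {"x", "z"}),   # y = x + z   (z is never assigned)
    ]

    defined = set()
    for target, reads in program:
        for var in sorted(reads - defined):
            print("warning: %s may be used uninitialized" % var)
        defined.add(target)   # target is defined from here on

    # prints: warning: z may be used uninitialized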

Parse tree and ambiguity in compiler design

Parse Trees

While deriving 7+4-5, one could produce a parse tree for it (one is built explicitly in the sketch at the end of this part).

You can read off the productions from the tree: for any internal (i.e., non-leaf) node, its children give the right-hand side (RHS) of a production having the node itself as the left-hand side (LHS).

The leaves of the tree, read from left to right, are called the yield of the tree. We call the tree a derivation of its yield from its root. The tree just described is a derivation of 7+4-5 from list.

Ambiguity

An ambiguous grammar is one in which there are two or more parse trees yielding the same final string. We wish to avoid such grammars.

The grammar above is not ambiguous. For example, 1+2+3 can be parsed only one way: the arithmetic must be done left to right. Note that I am not giving a rule of arithmetic, just of this grammar. If you reduced 2+3 to list, you would be stuck, since it is impossible to further reduce 1+list (said another way, it is not possible to derive 1+list from the start symbol).
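The grammar itself is not reproduced in this excerpt, but the discussion indicates it is the classic list → list + digit | list - digit | digit. Assuming that grammar, here is a minimal sketch in Python that builds the unique (left-associative) parse tree for 7+4-5 and reads off its yield.

    # Assumed grammar: list -> list + digit | list - digit | digit.
    # An internal node is (lhs, children); a leaf is just its lexeme.
    def parse(s):
        tree = ("list", [s[0]])                 # start with list -> digit
        for i in range(1, len(s), 2):
            op, digit = s[i], s[i + 1]
            tree = ("list", [tree, op, digit])  # grow leftward: list -> list op digit
        return tree

    def yield_of(tree):
        # The yield is the leaves read from left to right.
        if isinstance(tree, str):
            return tree
        return "".join(yield_of(child) for child in tree[1])

    tree = parse("7+4-5")
    print(yield_of(tree))   # 7+4-5 -- the tree is a derivation of its yield from list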

Tree traversal in compiler design

When traversing a tree, there are several choices as to when to visit a given node. The traversal can visit the node
- before visiting any children,
- between visiting children, or
- after visiting all the children.

I do not like the book's code, as I feel the names chosen confuse the traversal with visiting the nodes. I prefer the following pseudocode, which also illustrates traversals that are not depth first. Comments are introduced by -- and terminate at the end of the line.

    procedure traverse (n: node)
        -- visit(n);                  before children
        if (n is a leaf) return;
        c = first child;
        traverse (c);
        while more children
            -- visit (n);             between children
            c = next child;
            traverse (c);
        end while;
        -- visit (n);                 after children
    end traverse;

If you uncomment just the first visit, we have a preorder traversal, where each node is visited before its children.
If you uncomment just the last visit, we have a postorder traversal, or depth-first traversal, where each node is visited after all its children.
If you have a binary tree (all non-leaf nodes have exactly 2 children) and you uncomment only the middle visit, we have an inorder traversal.
If you uncomment all three visits, we have an Euler-tour traversal.
If you uncomment two of the three visits, we have an unnamed traversal.
If you uncomment none of the visits, we have a program that accomplishes very little.
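Here is a runnable Python version of the same idea, with the three visit points supplied as optional callbacks. One deliberate change from the pseudocode: the early return above means a leaf would only ever be visited at the first visit point, so this sketch instead visits a leaf once no matter which callback is supplied; that way postorder and inorder traversals include the leaves.

    def traverse(node, before=None, between=None, after=None):
        # A node is (value, children); a leaf has an empty children list.
        value, children = node
        if not children:
            visit = before or between or after   # all three points coincide at a leaf
            if visit:
                visit(value)
            return
        if before:
            before(value)                        # visit before any children
        for i, child in enumerate(children):
            if i > 0 and between:
                between(value)                   # visit between children
            traverse(child, before, between, after)
        if after:
            after(value)                         # visit after all the children

    expr = ("+", [("7", []), ("4", [])])         # the tree for 7+4
    traverse(expr, before=print)                 # preorder:  + 7 4
    traverse(expr, after=print)                  # postorder: 7 4 +
    traverse(expr, between=print)                # inorder:   7 + 4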

Computer Networks

Comprehensive detail on computer networks and their types in modern computing systems.

Winter Of Code - Coding Competition

Are you thinking of getting an internship or placement at the biggest tech giants: Microsoft, Amazon, or Samsung? We are here for you with the biggest opportunity to practice and experience a test of the same level!

Winter of Codes is a free online coding competition.

Eligibility: Working professionals and students from all years can participate.
Date: Every Saturday in December.
Time: 9:00 p.m.
Duration: 90 minutes.

What's in the test? Coding, aptitude, and logical reasoning.

Rewards:
- Cash prizes
- Mi Bands
- Free online courses
...and much more.

Competing with more than 150 colleges from all over India will give you a grand opportunity to face real competition!

Willing to participate? Register now and visit the link for more details: https://codingninjas.in/events/winter-of-codes?campaign=winterofcodes&source=ca405