How would I be able to make my own?
GCC itself does not include a linker. It emits assembly files. It then relies on an assembler and a linker being available, which it invokes for the assembling and linking steps. The GNU assembler (gas) and linker are part of the GNU binutils package.
The MinGW port includes the binutils, so that a linker is available.
If you want to write a linker yourself you need to understand both the object file format you want to link and the executable format you want as a result. The first step is to simply put all code into a single file. Then you walk through the code and resolve the linker references that the assembler put in. Those are placeholders for addresses that only the linker can determine. For the more complicated executable formats (PE, ELF) the linker must then generate relocation tables for all symbols (in case the file can't be loaded to the desired memory location).
There's some more stuff a linker must do, especially in C++: kicking out duplicate code, for example. (That's a result of the template mechanism.)
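The two core steps described above (put all code into one image, then patch the placeholders the assembler left behind) can be sketched roughly like this. It is only a toy under loud assumptions: `ObjectFile`, `Reloc` and `linkObjects` are made-up names, the object format is invented rather than PE, ELF or COFF, and every placeholder is assumed to be a 4-byte little-endian absolute address.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, heavily simplified object file (not PE, ELF or COFF).
struct Reloc {
    std::size_t offset;   // position of the 4-byte placeholder in this object's code
    std::string symbol;   // symbol whose final address belongs there
};

struct ObjectFile {
    std::vector<std::uint8_t> code;
    std::map<std::string, std::size_t> defined;  // symbol name -> offset in code
    std::vector<Reloc> relocs;
};

// Links all objects into one image that is assumed to be loaded at `base`.
std::vector<std::uint8_t> linkObjects(const std::vector<ObjectFile>& objs,
                                      std::uint32_t base) {
    std::vector<std::uint8_t> image;
    std::map<std::string, std::uint32_t> address;  // symbol name -> final address
    std::vector<std::size_t> start;                // where each object landed

    // Pass 1: concatenate the code and record where every symbol ended up.
    for (const ObjectFile& o : objs) {
        start.push_back(image.size());
        for (const auto& def : o.defined) {
            std::uint32_t addr =
                base + static_cast<std::uint32_t>(image.size() + def.second);
            if (!address.emplace(def.first, addr).second)
                throw std::runtime_error("duplicate symbol: " + def.first);
        }
        image.insert(image.end(), o.code.begin(), o.code.end());
    }

    // Pass 2: overwrite each placeholder with the address of its symbol.
    for (std::size_t i = 0; i < objs.size(); ++i) {
        for (const Reloc& r : objs[i].relocs) {
            auto it = address.find(r.symbol);
            if (it == address.end())
                throw std::runtime_error("undefined symbol: " + r.symbol);
            std::size_t at = start[i] + r.offset;
            if (at + 4 > image.size())
                throw std::runtime_error("relocation outside the image");
            for (int b = 0; b < 4; ++b)  // little-endian byte order
                image[at + b] = static_cast<std::uint8_t>(it->second >> (8 * b));
        }
    }
    return image;
}
```

A real linker also has to write the executable headers, emit relocation tables for the loader, and discard duplicate template code, none of which this sketch attempts.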
Quote: Originally Posted by chemicalNova: [...]
Maybe you guys should start to take donations so you can afford that stuff ;).
Quote: Originally Posted by chemicalNova: [...]
You could save your allowance, or try getting a job. You're old enough. Might find one on eBay real cheap. Just be sure there's something on compiling into a Win32 exe, and has source code on linkers. Those will come in handy.
Once again I'd like to raise the point: Which ASM compiler are we going to ship with LF? We need a free redistributable one (GCC/GNU/Cygwin/WinG?). I think it would be best to use the same compiler from the start.
I think if we use C++/MASM now and a different ASM compiler later on, we will encounter problems, but I could be wrong. I know most x86 ASM compilers are the same, but there have to be some differences, right? And those differences could cause errors.
Don't forget about a linker. I'm still trying to figure out where to find one, or source code to make one.
I was thinking of just using ASM in C++ rather than using MASM, GCC, or whatever. As long as we have access to the x86 instruction set, not to mention be able to use SIMD, then what's to fuss about?
These two links alone will help me somewhat with knowing the bare bones of exes:
http://www.madchat.org/vxdevl/papers...ile/pefile.htm
http://www.thecodeproject.com/win32/vbdebugger.asp
Just need to figure out how to get the compiler and linker to convert it to this.
Oh and chemical nova hooked me up with this:
http://compilers.iecc.com/crenshaw/
Quote: Originally Posted by Jacob Roman: [...]
2 things:
1) I think the __asm{} part of C++ is MASM (if you are using Visual Studio).
2) We can't redistribute the ASM compiler within C++ when we release LF. We need to be using an ASM compiler and linker that we can redistribute.
What do you mean we can't redistribute it? Of course we can.
Okay, one of us is missing something, maybe it's me. Here is how I am viewing things (assume LightFusion is made and done):
Code is written in LF IDE
-> Code is sent to an interpreter (which outputs ASM)
-> ASM is sent to an ASM compiler (point which I will discuss below)
-> ASM compiler outputs object files
-> Linker puts object files into EXE
When the ASM is sent to a compiler, we need to be using a compiler that we can redistribute. Same goes for the linker.
Do you see what I mean now?
Here is an updated version of the .chm. Let me know what you think.
Just two lines on terminology:
You send code to a compiler that outputs assembly.
You send assembly to an assembler that outputs object code.
So there's no "asm compiler", and an "interpreter" doesn't output code.
Quote: Originally Posted by CornedBee: [...]
Thanks CornedBee. :)
Once again, I am very new to this.
Doesn't an interpreter change code from one type of code to another, i.e. from LF to ASM?
No, an interpreter doesn't change code, it executes it as it is. The CPU "interprets" machine code. The JavaVM sometimes interprets bytecode. The Rhino engine interprets JavaScript.
Everything that changes one form of code into another is called a compiler (unless it's a preprocessor - a preprocessor doesn't understand all the code it handles, only parts of it, and passes the rest unchanged through to the output). A C++ compiler converts C++ code to Assembly (or possibly C, to be again compiled by a C compiler). A Haskell compiler converts code to Assembly (or again C - it's a popular compilation target). An assembler (which is actually a compiler, although it's not called that) converts assembly to machine code.
The edges are blurred, though. The .Net CLR should be called an interpreter based on what it does (executing .Net IL), but its inner technology is based on JIT compiling, so it's a compiler internally. Same applies in some situations to the JavaVM. The Rhino JavaScript engine is capable of compiling some JavaScript to Java bytecode. The PHP interpreter can compile PHP to bytecode. The Perl interpreter can compile scripts to bytecode. The Haskell interpreter Hugs compiles files before executing them.
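To make the distinction concrete, here is a small sketch that takes the same expression tree and either executes it (interpreter) or translates it into assembly text without executing anything (compiler). The `Expr` type and the `num`, `bin`, `interpret` and `compile` helpers are invented for this example, and the emitted text is only x86-flavoured stack code for illustration, not a complete program any particular assembler would accept.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression tree: op is '+', '*', or 'n' for a number literal.
struct Expr {
    char op;
    int value;
    std::unique_ptr<Expr> lhs, rhs;
};

std::unique_ptr<Expr> num(int v) {
    std::unique_ptr<Expr> e(new Expr());
    e->op = 'n';
    e->value = v;
    return e;
}

std::unique_ptr<Expr> bin(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    std::unique_ptr<Expr> e(new Expr());
    e->op = op;
    e->value = 0;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Interpreter: executes the code as it is and hands back the result.
int interpret(const Expr& e) {
    if (e.op == 'n') return e.value;
    int l = interpret(*e.lhs);
    int r = interpret(*e.rhs);
    return e.op == '+' ? l + r : l * r;
}

// Compiler: turns the code into another form (assembly text); nothing runs here.
std::string compile(const Expr& e) {
    if (e.op == 'n') return "push " + std::to_string(e.value) + "\n";
    std::string opcode = (e.op == '+') ? "add eax, ebx\n" : "imul eax, ebx\n";
    return compile(*e.lhs) + compile(*e.rhs) +
           "pop ebx\npop eax\n" + opcode + "push eax\n";
}
```

For `2 + 3 * 4`, `interpret` returns the number 14, while `compile` returns text that still has to go through an assembler and linker before anything is computed.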
Thanks. :)
That clears it up nicely for me.
One more question on the topic (for now :)): Are bytecode and machinecode the same thing?
Jake, are you ok with "making your programming dreams come true" as our unofficial motto for now?
Here is an ASM tutorial I found at school the other day:
http://www.drpaulcarter.com/pcasm/index.php
(scroll to the bottom to download the PDF or PostScript formats of the book)
It's shorter than AoA and will give me a good overview. I will still read AoA, but just later.
You guys might want to decide soon. Different instruction sets have different registers, and different functions with different uses. If you want any help apart from you two, we kinda need to know what's going to happen :)
chem
- Each CPU supports specific instruction sets. Assemblers must also support those instruction sets, but the instruction set itself is the same whichever assembler you use. An x86 assembler produces x86 machine code which runs on an x86 CPU.
- The way we are discussing the compiler you'd think we had finished the language. Yet I don't even know the half of what it's supposed to be like yet...
Also, different Assemblers, no matter what their final CPU support, have different syntaxes.
chem
Quote: Originally Posted by eyeRmonkey: [...]
Kind of. Both are binary-encoded instructions that can be rapidly parsed and executed. However, machine code refers to things that are primarily directly executed by some existing hardware, while bytecode refers to things that are primarily executed by some interpreter/JIT-compiler like the JavaVM, the .Net CLR and similar things. Bytecode thus is kind of platform-independent.
If the compiler and linker get too complicated, I can always have the LightFusion syntax converted to C++ rather than assembly, and we can use a C++ compiler and linker which will do it for us. That's just in case though.
Jake, you still have yet to answer my question.
I think we need to use the same assembler to write the LF compiler that we use to compile the code that LF produces. That means it needs to be redistributable. Do you understand what I am asking?
Once again, maybe I am missing something, but I don't think so.
Penagate/Chem, like I said earlier, after I finish documenting ideas up to this point in the thread and in other LF related threads, I will bring the focus back to a list of goals for LightFusion (the lists that CornedBee brought up). Unless someone wants to bring it up now...
I think I might be confused on the basics of what we are trying to do here. Let me re-explain what I am thinking (while probably using some more terms wrong ;)).
1) A user writes code in the LightFusion IDE
2) That code is translated into ASM
3) That ASM code is sent to an assembler (which we don't write ourselves - IE: We use one that is already made and that we can redistribute)
4) The object files that the assembler outputs are linked using another program that we don't write ourselves.
Am I correct in those assumptions? Because I get the feeling you were thinking of something else, Jake.
Quote: Originally Posted by chemicalNova: [...]
You disagreed with that in the beginning. Now you're doing that?
Could have saved a whole lot of discussion..
chem
I never disagreed with it. I thought that's what we were doing the whole time. I think Jake is considering writing a compiler that doesn't rely on an assembler and linker. Or he is making the assembler and linker himself.
I truly don't see the point in that. There are already many assemblers/linkers out there that have tons of optimizations available and that we will never be able to compete with. I think that if we make a compiler that does all that then we are dooming ourselves to failure because that means A LOT more work and many more chances for us to throw in the towel.
I think writing an IDE and a compiler that outputs ASM (which is automatically fed into a redistributable asm compiler) would be easier. We could focus on optimizing the way the ASM is output and that would be the focus of our speed.
Jake, does your book bypass the ASM translation part in a sense? I mean, does it teach you to parse the language you make and output directly to obj files which are fed into a linker?
Jake, I was just re-reading the entire thread and noting any suggestions made into a few .docs and I have a few points to bring up with you:
1) There were about 10 or so times where people asked you a question and you didn't respond to them. If we are going to do this project then we need to communicate. I make every effort to reply to every post that I have any input in, and I definitely reply to every post that is directed at me.
2) A lot of the things you said implied you have been doing lots of work (you mentioned you started the IDE, and a few other things). That's awesome, but I think we should be posting our work as we go. I can understand that you may not feel it is ready to be posted yet, but I hope you will post things when you can (documentation, sources, whatever).
If you post stuff, you'll get more suggestions so it can only improve ;)
Quote: Originally Posted by manavo11: [...]
By "stuff" I assume you mean code, documentation, mini-apps, uncompleted .exe's, etc.?
Quote: Originally Posted by eyeRmonkey: [...]
Obviously :) Posting stuff about the project and how it's coming along, not random jokes or something :)
:lol:
Ok. :)
Alright, I am done documenting everything thus far. Everything is sorted into a few .docs and is fairly sloppy, but at least now the suggestions we have gathered so far won't get lost or forgotten about.
It's time to focus on deciding the foundations of the language. It's something we have all managed to avoid in the excitement of getting this project going, but now it's time to get down to business.
Many people have told us that "a faster VB/C++" is not enough to build a language around, and they are right, so we are going to use the lists that CornedBee posted that were the foundation for C++ as a template for our own similar lists.
The rules we decide to put on these lists will be what governs the rest of the development of this project, so let's pick carefully (and of course we can add/change/remove things later).
Here is what we have so far as "The Rules":
Aims:
* LightFusion will be easy to learn and will make it easy to build standard Windows applications without sacrificing speed of execution.
* LightFusion will make programming enjoyable for anyone who uses it.
* LightFusion will combine the popular features of languages such as Visual Basic and C++ and add features that seem to be lacking from those languages.
General Rules:
* Always make what the compiler is doing obvious to the programmer (through the syntax, documentation, features, etc) without confusing things too much.
* Don't force the programmer.
* Every feature must have a reasonably obvious implementation.
Design Support Rules:
* Support composition of software from separately developed parts.
* Support common programmer styles.
* Support program organization and readability.
Language/Technical Rules:
* Use features/syntax from other languages unless there is a good reason not to.
* When in doubt, pick a solution that is easiest to teach and displays what the compiler is doing.
What do you think should be changed/added/removed?
My 2 cents:
I think this rule "Always make what the compiler is doing obvious to the programmer (through the syntax, documentation, features, etc) without confusing things too much." is the most important (and not just because I thought of it).
The reason I say that is because no matter how much work we put into making a good compiler and no matter how much we optimize things, a huge amount of the burden of speed is on the programmer. I think that if we do our best to let the programmer know what is fastest for each situation then LightFusion programs will turn out faster on average.
I think this needs to be done through the syntax and the naming of library functions mainly. If we just do it through documentation and features (such as allowing the programmer to edit the ASM before it is sent to the assembler) then those things can be ignored. But if we play our cards right, then the user will HAVE to know what is faster BECAUSE they know the language. Here is a (poor) example of what I mean:
Code:
Fast1LineIf a = 1 : Call MySub()
SlowMultiLineIf b = 10 {
c = a * b
Call MySub1(c)
}
This is obviously a horrible example, but I can't think of a better way to illustrate my point because I have never seen this done in a programming language before. We might sacrifice some usability in the process, but it might be worth it.
This may, on the other hand, be a bad idea, because not all our users may be interested in speed and it may make the language harder to use/read. We could just offer features (like editing the ASM) for the users who WANT fast apps.
What do you guys think?
What is this?
VB Code:
Fast1LineIf a = 1 : Call MySub()
SlowMultiLineIf b = 10 {
c = a * b
Call MySub1(c)
}
You don't check for c's value, don't increment a or b, and don't say what MySub does. Plus you call it with a parameter once, and then again without one.
It would take a psychic to know that syntax, I think! :)
Apart from that. If blocks compile to the same ASM anyway (I know I'm picking on your example)
Code:
if (a = 1) { mySub() }
Code:
cmp a, 1
jne EndIf1
call mySub
EndIf1:
Code:
if (b = 10) {
c = a * b
mySub1(c)
}
Code:
cmp b, 10
jne EndIf2
mov eax, a      ; x86 has no memory-to-memory mov, so go through a register
imul eax, b
mov c, eax      ; c = a * b
push eax
call mySub1
EndIf2:
As you can see the overhead of the If construct is the same in both situations, two instructions only.
:lol: Sorry. I guess I was nowhere near clear enough on that. My point was that the syntax should tell what is fastest for which situation. After thinking it over more, it might not be the best idea, but here is what I meant (assuming a 1 line if statement is faster in ASM than a multi-line IF statement - which might be a bad assumption):
VB Code:
Fast1LineIf a = 1 {
// Do something short
}
SlowMultiLineIf a = 2 {
// Do a lot of work
// Do more work
}
Like I said, it is a horrible example, but my point is still that it might be a good idea to let the programmer know what is fastest through the syntax. I have been thinking about it more since I posted that, and I think it's not such a great idea because it is hard to implement (without making the language hard to write).
After more thought I realized that it's easier to make the syntax/function names similar to ones that are already around and write strong documentation for the language, so that IF the programmer's goal is speed then they have a good opportunity to reach that goal.
Also we will try to offer the programmer access to the ASM before it is sent to the assembler, so there is another chance for speed. I just had another idea...
IDEA: Give an option to output an execution flow in text of the program. I think this could really help debugging. What I mean is if you had some code that looked like this (in VB):
VB Code:
Sub MySub()
    a = Val(MsgBox("A Number?"))
    a = a + MyFunc(a)
End Sub

Function MyFunc(val)
    MyFunc = val * 3.14159
End Function
... then the execution flow output would look something like this:
Code:
Start myApp
Call MySub
Show MsgBox (Return: "10")
a = 10
Call MyFunc(val:=10)
MyFunc Return = 31.4159
a = 10 + 31.4159 = 41.4159
End myApp
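One way such an execution flow log could be produced is for the compiler to emit a logging call at every call, return and assignment. The sketch below fakes that by hand; `FlowLog`, `mySub` and `myFunc` are hypothetical names, and integers (with a multiplier of 3) stand in for the 3.14159 example so the text formatting stays simple.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flow recorder: collects one indented line per traced event.
class FlowLog {
public:
    void call(const std::string& what) {
        add("Call " + what);
        ++depth_;  // everything inside the callee is indented one level deeper
    }
    void ret(const std::string& name, int value) {
        --depth_;
        add(name + " Return = " + std::to_string(value));
    }
    void add(const std::string& text) {
        lines_.push_back(std::string(2 * depth_, ' ') + text);
    }
    const std::vector<std::string>& lines() const { return lines_; }

private:
    std::vector<std::string> lines_;
    std::size_t depth_ = 0;
};

// The logging calls below are what a compiler would insert in trace mode.
int myFunc(FlowLog& log, int val) {
    log.call("MyFunc(val:=" + std::to_string(val) + ")");
    int result = val * 3;
    log.ret("MyFunc", result);
    return result;
}

int mySub(FlowLog& log, int input) {
    log.call("MySub");
    int a = input;
    log.add("a = " + std::to_string(a));
    a = a + myFunc(log, a);
    log.add("a = " + std::to_string(a));
    log.ret("MySub", a);
    return a;
}
```

Calling `mySub(log, 10)` and printing `log.lines()` gives an indented trace in the same spirit as the output above.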
Quote: Originally Posted by eyeRmonkey: [...]
It shouldn't necessarily make it any harder to read, as long as it is done properly.
Plus, an experienced programmer can tell why some code is faster and some slower anyway, without an in-depth knowledge of the language. Take VB's IIf() for example:
VB Code:
a = IIf((b = c), a / 0, a * 5)
We know that is slow because both expressions are always evaluated. So even if b does not equal c, it will still break on the divide by zero. Many might see this as a weakness in VB. But we know that IIf is a function; it accepts 3 parameters and returns a result. Without any knowledge of VB itself you hence know that the 3 expressions are evaluated before they are passed to the function.
VB Code:
Function IIf(Expression, TrueResult, FalseResult)
    If (Expression) Then
        IIf = TrueResult
    Else
        IIf = FalseResult
    End If
End Function
Now on the other hand if conditional assignment was a part of VB, like it is in C++, C# etc., then you remove the function wrapper, and besides saving on the calling overhead you also only evaluate one expression.
VB Code:
a = (b = c) ? a / 0 : a * 5
And the reason for that long-winded explanation is to show that if you know basics of low-level programming, you don't require an in-depth knowledge of a language to tell which programming methods are fast and which are slow.
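The same effect can be shown in C++ with a counter instead of a divide by zero. In this sketch `iif`, `counted` and `evaluations` are made-up names: a function-style `iif` has both branch arguments evaluated before it is even called, while the built-in conditional operator evaluates only the branch it selects.

```cpp
// Counts how many branch expressions were actually evaluated.
int evaluations = 0;

int counted(int v) {
    ++evaluations;
    return v;
}

// Function-style conditional, like VB's IIf: the arguments are ordinary
// parameters, so both are evaluated by the caller before iif runs.
int iif(bool cond, int whenTrue, int whenFalse) {
    return cond ? whenTrue : whenFalse;
}

// Built-in conditional operator: only the selected branch is evaluated.
int viaOperator(bool cond) {
    return cond ? counted(1) : counted(2);
}

// Same choice made through the function wrapper.
int viaFunction(bool cond) {
    return iif(cond, counted(1), counted(2));
}
```

After `viaFunction(true)` the counter has gone up by two; after `viaOperator(true)` it goes up by only one, even though both return the same value.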
That brings me to my next point however, which is that you do require detailed knowledge of a language to tell which built-in syntactic features are faster than others. If these features look the same (say, one line of code) then I'm not sure how we are going to make it obvious to tell between faster and slower methods.
Quote: Originally Posted by eyeRmonkey: [...]
Yeah... see my post above :)
Ok, I am completely abandoning the idea of naming functions/keywords to make the programmer know which is faster, because it's really not possible.
I am not, on the other hand, giving up on features like giving the programmer a chance to edit the ASM and giving good documentation.
Quote: Originally Posted by eyeRmonkey: [...]
I don't understand this...
Why would you want a slow function if you already have a faster version?
Because it has greater capabilities and doesn't only handle special cases. Or because it's more precise. Take the exponentiation earlier in this thread. The special case for integer powers only is far faster than the function that can handle floating point powers. Or the square rooting. You can write approximation functions that are pretty quick, but they are, well, approximations.
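As a rough illustration of that trade-off, here is a sketch of an integer-only power routine next to the general `std::pow`. The name `powInt` is made up for this example, and whether it actually beats `std::pow` on a given compiler and CPU is something to measure rather than assume; the point is only that it handles a special case (non-negative integer exponents) that the general function does not need to be restricted to.

```cpp
#include <cmath>

// Special case: non-negative integer exponents only, using repeated squaring.
// It needs roughly log2(exp) loop iterations instead of exp multiplications.
double powInt(double base, unsigned exp) {
    double result = 1.0;
    while (exp > 0) {
        if (exp & 1u) result *= base;  // this bit of the exponent is set
        base *= base;                  // square for the next bit
        exp >>= 1;
    }
    return result;
}

// General case: std::pow also accepts fractional exponents such as 0.5,
// which powInt cannot express at all.
double squareRootViaPow(double x) {
    return std::pow(x, 0.5);
}
```

So `powInt(2.0, 10)` and `std::pow(2.0, 10.0)` agree, but only the general function can do something like `std::pow(2.0, 0.5)`.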
Quote: Originally Posted by eyeRmonkey: [...]
You still haven't gotten it, eyeR. The compiler is not written entirely in assembly. That's why I said the ASM stuff has to be in C++. And it still will be redistributable. ;)
And I think it would be easier and allow us to have more control if we converted the LightFusion code to C++ and used a C++ compiler and linker. That way we can easily have the Win32 and DOS exes we want.
Okay Jake, I am going to assume you have had a busy weekend, because the last 10 replies I made were directed somewhat at you and you only replied to the first one.
Back to the compiler: I think you are missing something this time. If we are going to translate LF into C++ and throw that C++ into a C++ compiler and linker then we DEFINITELY need a FREE and REDISTRIBUTABLE C++ compiler and linker to ship with the LightFusion IDE. Do you not see what I am saying? There are 2 separate parts to it:
1) You write the compiler (for LF) in ASM that translates code from LF to (ASM?/C++?).
2) Now you have that output from the compiler and it needs to be fed (automatically) into a C++/ASM compiler and then a linker.
Do you see what I am saying?
Here is what I think you are thinking:
1) You write a compiler in ASM that translates LF into ASM and outputs .obj/.exe files.
Now maybe the book you are reading does that (I think it does as far as .obj files because you mentioned that earlier) but at the very least we need a free/redistributable linker to feed the .obj files into.
Quote: Originally Posted by CornedBee: [...]
That's a better response than I would have given. At least you understand what I was going for. :)
Another example would be a PictureBox and ImageBox in VB. Most noobs don't know that an ImageBox is a lot faster, but has fewer uses. If LF had a naming convention that let you know which was fastest then I WAS THINKING that might be helpful, but not any longer.
Maybe in a few places we could do that, but for the most part I think it reduces ease of use a little too much and as you gain experience with a language you learn things like that.
IDEA: Once again, since one of our goals is speed, I think it would be awesome to ship an extremely accurate benchmarking timer as part of the standard library.
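For a sense of what such a timer might look like, here is a minimal sketch built on the standard C++ `<chrono>` clocks. `timeMs` is a hypothetical helper name, and the accuracy you actually get depends on the resolution of `std::chrono::steady_clock` on the platform, so treat it as a starting point rather than a precise benchmarking tool.

```cpp
#include <chrono>

// Runs `work` `repeats` times and returns the average time per run in
// milliseconds. steady_clock is used because it never jumps backwards.
template <typename F>
double timeMs(F&& work, int repeats) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / repeats;
}
```

Averaging over many repeats helps when a single run is shorter than the clock's resolution; something like `timeMs([&] { doWork(); }, 1000)` would be the typical call, with `doWork` standing in for whatever is being measured.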
Quote: Originally Posted by Jacob Roman: [...]
As far as this goes, I am not sure I agree with you. I don't see how we even have a chance at making a language faster than C++ if we are effectively writing it in C++.
BUT if we do go that way with things then we have that free/redistributable compiler (and I think linker) available to us:
http://msdn.microsoft.com/visualc/vctoolkit2003/
It is the same compiler that Visual Studio.NET uses for C++ and it is free and I think it is redistributable, but I only skimmed the license.
We could write the compiler in VC++ 6 (that's what you have, right?) and then have it output ASM/C++ that can be fed into the 2003 compiler.
Quote: Originally Posted by eyeRmonkey: [...]
The ImageBox is faster? I wouldn't think so, even though the difference would be barely measurable. The PictureBox is just another window. The ImageBox is a completely new control, is it not?
I must be a noob :eek:
chem
Quote: Originally Posted by eyeRmonkey: [...]
I haven't been able to get on the computer much because my computer is dead and I'm stuck using my parents' computer and it's difficult for me to get on when everyone in my family is using it. And I'm still waiting for a couple computer parts to come so I can have my new computer built. So I apologize if I missed any questions.
The first thing I have been programming so far is the IDE for LightFusion. It uses an MDI (Multiple Document Interface) to allow multiple code windows and forms to be displayed. It's going really well so far, but I still have a long way to go. When I get much of that complete, I will go by the book step by step by first making a program lister, then a scanner, then a parser, then the compiler. But around the same time I have a lot of reading and syntax designing to do.
Quote: Originally Posted by eyeRmonkey: [...]
The asm built into C++ is sufficient, like I've been trying to tell you. Assemblers such as MASM, TASM, NASM, etc., along with C++'s built-in assembler, all use the x86 instruction set. It doesn't matter which one you use because they all communicate with the processor and its registers the same way. If you were to do something like this in any of the assemblers:
Code:
mov eax, 5h
mov ebx, 7h
add eax, ebx
it executes the same, and communicates with the same processor manipulating the data within the registers. Plus I've been planning on using the SSE instruction set, which is faster than the x86, and I doubt MASM and those others support it. I do know that C++ supports it. And since the asm code is converted to machine language in the end when made into the exe, the exe itself can be redistributed freely without any problems. If you used the compiler to convert simple addition to the code that I have located above, whether the compiler was made from C++, VB6, MASM, or whatever, then it doesn't matter, because the output assembly would have been the same.
Ok fine. ASSUMING your compiler outputs .obj files then we still need a free and redistributable linker to link those object files. Unless you plan on making that too?
And if you are serious about LF being translated into C++/ASM and not just ASM then we might as well use a C++ compiler that is already made (VC++ Toolkit 2003 - link above) and only do the parsing/translating ourselves then pass that code to the VC++ compiler. Why reinvent the wheel?
Quote: Originally Posted by chemicalNova: [...]
There is a big difference. PictureBox is another window and that is why it is so slow. It has an .hDC and many other properties and events and such that slow it down. If you used a PictureBox every time you wanted a JPG on your form then you would be slowing down your program a lot. ImageBox can do that just fine with less memory usage.
Quote: Originally Posted by eyeRmonkey: [...]
We will do that with a twist. Much of the LF code will be converted to SSE assembly in C++, and execute faster than normal C++ operations such as addition, subtraction, multiplication, division, etc.
Quote: Originally Posted by Jacob Roman: [...]
Yes, I see what you're saying. But I think you are still missing my point. Take this (made up scenario):
1) We write a parser for LF in (XXX) language.
2) That parser outputs ASM (right?)
3) The ASM is assembled by __________ (<-- blank = the problem)
4) The .obj files are linked into a .exe
Once again, if the compiler your book has you make, outputs .obj files then all we need is a free linker.
BUT if we want to save some work on our part then we can just write the parsing stuff and feed the ASM that has been parsed into an assembler. Do you want to do it that way? I think we would gain a lot of speed if we did. But if we did then your book wouldn't be that helpful because it skips that step and outputs .obj files (or does it?).
Quote: Originally Posted by Jacob Roman: [...]
See the question (now in bold) that you missed?
The parser will be written in C++, and will output asm, only asm within C++, so we can use the VC++ compiler/linker. So what we are actually doing is making a precompiler.
Answered your question in bold ;)
Quote: Originally Posted by Jacob Roman: [...]
Well if you have been putting a lot of work into the IDE and syntax please keep us updated. I guess if you don't have access to a computer as often as you like then that is somewhat difficult. But just mention features you are working on for the IDE and post syntax ideas you have. The rest of the team can't do much unless you communicate with us.
Quote: Originally Posted by Jacob Roman: [...]
Yes, the instruction set is always the same, BUT the syntax for each assembler is different. INCLUDEs are different in NASM and MASM. Plus MASM has a million more compiler-side features that allow macros and ease of use. So they aren't all exactly the same.
Quote: Originally Posted by Jacob Roman: [...]
But here is what you are missing: Every person that downloads the final release of LF must have the VC++ compiler/linker. So either they must have VS .NET or we can make them download VC++ Toolkit 2003 (because I don't think we can redistribute it ourselves). Now do you see what I mean?
Your link says this:
Quote:
Are there any restrictions on how I use the Visual C++ Toolkit?
In general, no. You may use the Toolkit to build C++-based applications, and you may redistribute those applications. Please read the End User License Agreement (EULA), included with the Toolkit, for complete details.
So we should check in there if we are allowed to distribute the compiler/linker.
I'm a step ahead of you bud. :)
I already copied the EULA to a Word document so I can read it over and over. As far as I can tell it only lets us redistribute code that comes with the compiler and not the compiler itself. So our only option (but probably our best option) is to force the LF user to DL that toolkit.
I did it for you. So far I didn't see anything to restrict us from doing so.
Quote: Originally Posted by Jacob Roman: [...]
BTW - I have been looking into the VC++ toolkit for a couple days and it supports SSE/SSE2 stuff.
I know. That's why I want to use it ;)