Thread: memory access in Win32, win64 and general memory access

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Feb 2017
    Posts
    3,372

    memory access in Win32, win64 and general memory access

    I'm not sure if this is the proper or best forum section to post this kind of question, but many forum users come here and it seems to be a general-purpose section. Mods might want to move this thread if they consider that it is not in the proper section.

    I was thinking about the move from Win32 to Win64, and it occurred to me that RAM (and disk) is still organized in 8-bit bytes.

    Then I wondered whether the four bytes of Win32 or the eight bytes of Win64 are accessed sequentially, one by one, or all at once when their contents are put into a CPU register. I guess the latter, but I'm not sure.

    I performed a quick Google search, but the first results didn't seem to address the issue (at least not directly), so I decided to ask someone who already knows, hoping for a simple answer.

    (of course, it is just for curiosity and general knowledge)

    TIA.

  2. #2
    Super Moderator Shaggy Hiker
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    36,473

    Re: memory access in Win32, win64 and general memory access

    I did move the question, because it's a very good question. While Chit-Chat gets visited by lots of people, you also get a bunch of flippant responses, and there is no requirement that anyone stay on topic. I didn't want this particular thread to wander.

    I suspect that the answer comes in different layers. Suppose we are looking for a particular integer. On a 32-bit system, that could be a 32-bit integer in four bytes, while on a 64-bit system it could be a 64-bit integer in eight bytes. Technically, those would be two different data types, but what I want to consider is a "thing that is the width of the register", and I am using Integer for convenience.

    If that integer is on the disc, and not anywhere in RAM, then I believe it will not be JUST the integer that gets loaded. I believe it will be the integer, and a large block of memory that contains the integer that gets loaded to RAM. What this means is that, if you then want to find a second integer that is in the same block, it is now in RAM, not on the disc, which means faster access.

    On modern CPUs, my understanding is that the CPU cache works in essentially the same way, though the block sizes differ: cache lines are typically 64 bytes, while 4 KB is the usual page size for moving data between disc and RAM. Therefore, when you want your Integer, which is now in RAM, the Integer is moved to the cache along with the block that is around it. That puts ALL of that memory on the CPU, thereby speeding up access for anything that is in the cache. That works well for lots of things, because bytes in memory tend to be related to those that are near them, such that if you want N, then you are likely to also be wanting M and O in the near future.
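    If you want to see that locality effect for yourself, here's a rough VB.NET sketch (the array size and names are made up purely for illustration, and the timings will vary by machine): both loops read exactly the same data, but on most machines the row-order walk finishes noticeably faster than the column-order walk, simply because it reuses every cache line it pulls in.
    Code:
    Imports System
    Imports System.Diagnostics

    ' Rough sketch: walking the same 2D array two different ways.
    ' .NET stores 2D arrays row by row, so the first loop reads memory
    ' sequentially (cache friendly) and the second loop jumps around.
    Module CacheDemo
        Sub Main()
            Const N As Integer = 4000
            Dim data(N - 1, N - 1) As Integer
            Dim sw As New Stopwatch()

            ' Row order: consecutive elements, so each fetched cache line is reused.
            sw.Start()
            Dim sum1 As Long = 0
            For r As Integer = 0 To N - 1
                For c As Integer = 0 To N - 1
                    sum1 += data(r, c)
                Next
            Next
            sw.Stop()
            Console.WriteLine("Row order:    " & sw.ElapsedMilliseconds & " ms")

            ' Column order: each read jumps N * 4 bytes ahead, so cache lines
            ' are rarely reused before being evicted.
            sw.Restart()
            Dim sum2 As Long = 0
            For c As Integer = 0 To N - 1
                For r As Integer = 0 To N - 1
                    sum2 += data(r, c)
                Next
            Next
            sw.Stop()
            Console.WriteLine("Column order: " & sw.ElapsedMilliseconds & " ms")
        End Sub
    End Module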

    From there, when you move your Integer into a register, you get the whole thing in one shot. It isn't byte0, then byte1 then byte2, and so on. For this reason, it is often a bit faster to use Integers over bytes, despite the byte being smaller. If you have 32 bit registers, and you get a byte, you are going to get 32 bits and have to discard the top 24.

    I may have some things mixed up. It has been over a decade since I've looked at anything with CPUs at that level, but I believe I am at least pretty close, and others can improve on the answer.
    My usual boring signature: Nothing

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Feb 2017
    Posts
    3,372

    Re: memory access in Win32, win64 and general memory access

    Yes, thank you Shaggy Hiker, that makes sense.
    And when it moves a RAM block to the cache my guess is that it must be several bytes per operation, not byte by byte.

  4. #4
    Super Moderator Shaggy Hiker
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    36,473

    Re: memory access in Win32, win64 and general memory access

    A few thousand bytes to a page, I believe. My understanding is that chunks are paged in and out of the cache. This has some organizational impact, as there's an advantage to related objects being relatively close together, as they are likely to all be in the cache at the same time.
    My usual boring signature: Nothing

  5. #5

    Thread Starter
    PowerPoster
    Join Date
    Feb 2017
    Posts
    3,372

    Re: memory access in Win32, win64 and general memory access

    I mean whether, when copying from the RAM module to the CPU cache, it is done one byte at a time or several bytes at once (in parallel).

  6. #6
    Super Moderator Shaggy Hiker
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    36,473

    Re: memory access in Win32, win64 and general memory access

    I would guess that the best answer might be "both", but that's not something I have much knowledge of. I would guess that the bus is not bringing single bytes to the CPU cache in each cycle, but rather some larger number, which may be 4, 8, or 16 bytes. Then, on top of that, it does that over and over so that it fetches several thousand bytes overall. That at least used to be the case; see the Transfer Rates section of this page on the front-side bus. However, that's old technology, with the Intel QuickPath Interconnect being more modern.
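    For what it's worth, the usual figures for an ordinary DDR3/DDR4 DIMM work out like this (a back-of-the-envelope VB.NET sketch, assuming the common 64-bit data bus and burst length of 8, not measured on any particular machine):
    Code:
    ' Typical desktop DIMM figures (assumed, not specific to any machine).
    Module BurstMath
        Sub Main()
            Const BusWidthBits As Integer = 64  ' data lines on a standard non-ECC DIMM
            Const BurstLength As Integer = 8    ' transfers per burst on DDR3/DDR4
            Console.WriteLine((BusWidthBits \ 8) * BurstLength & " bytes per burst")  ' 64 -- about one cache line
        End Sub
    End Module
    So a single burst delivers roughly one cache line, and filling a whole 4 KB page takes many such bursts.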
    My usual boring signature: Nothing

  7. #7
    PowerPoster
    Join Date
    Feb 2006
    Posts
    22,741

    Re: memory access in Win32, win64 and general memory access

    I think the CPU's supporting chipset dictates things like the width of memory in bits (multiples of 8 plus possible parity), block size for cache loading, etc. Of course the chipset must be compatible with the CPU.

    Front-side bus might also be helpful.

  8. #8

    Thread Starter
    PowerPoster
    Join Date
    Feb 2017
    Posts
    3,372

    Re: memory access in Win32, win64 and general memory access

    Memory module (DIMM) pin counts:

    DDR: 184 pins
    DDR2: 240 pins
    DDR3: 240 pins
    DDR4: 288 pins

    That must point to some parallel fetching of bytes.

  9. #9
    Super Moderator Shaggy Hiker
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    36,473

    Re: memory access in Win32, win64 and general memory access

    Certainly. Quite a bit of it, really. If you read the second of the links I posted, there's a unit called the flit, which is 80 bits wide; three of those would make 240. Whether or not that is relevant isn't clear to me, though.
    My usual boring signature: Nothing

  10. #10
    Angel of Code Niya
    Join Date
    Nov 2011
    Posts
    6,379

    Re: memory access in Win32, win64 and general memory access

    Quote Originally Posted by Shaggy Hiker
    I did move the question, because it's a very good question. While Chit-Chat gets visited by lots of people, you also get a bunch of flippant responses, and there is no requirement that anyone stay on topic. I didn't want this particular thread to wander.

    I suspect that the answer comes in different layers. Suppose we are looking for a particular integer. On a 32-bit system, that could be a 32-bit integer in four bytes, while on a 64-bit system it could be a 64-bit integer in eight bytes. Technically, those would be two different data types, but what I want to consider is a "thing that is the width of the register", and I am using Integer for convenience.

    If that integer is on the disc, and not anywhere in RAM, then I believe it will not be JUST the integer that gets loaded. I believe it will be the integer, and a large block of memory that contains the integer that gets loaded to RAM. What this means is that, if you then want to find a second integer that is in the same block, it is now in RAM, not on the disc, which means faster access.

    On modern CPUs, my understanding is that the CPU cache works in essentially the same way, though the block sizes differ: cache lines are typically 64 bytes, while 4 KB is the usual page size for moving data between disc and RAM. Therefore, when you want your Integer, which is now in RAM, the Integer is moved to the cache along with the block that is around it. That puts ALL of that memory on the CPU, thereby speeding up access for anything that is in the cache. That works well for lots of things, because bytes in memory tend to be related to those that are near them, such that if you want N, then you are likely to also be wanting M and O in the near future.

    From there, when you move your Integer into a register, you get the whole thing in one shot. It isn't byte0, then byte1 then byte2, and so on. For this reason, it is often a bit faster to use Integers over bytes, despite the byte being smaller. If you have 32 bit registers, and you get a byte, you are going to get 32 bits and have to discard the top 24.

    I may have some things mixed up. It has been over a decade since I've looked at anything with CPUs at that level, but I believe I am at least pretty close, and others can improve on the answer.
    This is a good answer but I feel some things could be clarified.

    CPU cache, if you're talking about the L1 and L2 caches etc., has little to do with whether a CPU is 32-bit or 64-bit. Accessing instructions and data from RAM is very expensive, so these caches are used for instructions and data that are frequently accessed. They exist on the CPU itself, which is why they are blazingly fast. The bitness of a processor, for lack of a better word, can influence the way these caches work or the way they are designed, but the caches really don't care about that. They work more or less the same way regardless of what kind of processor it is.

    As for what makes a 64-bit processor a 64-bit processor: this can be quite involved, but a general rule of thumb is to look at the size of the general-purpose registers. If those registers are 64 bits wide, then the CPU can be considered a 64-bit CPU. Also, we should never assume that all CPUs use the full 64 bits to address memory. You will find that a lot of modern 64-bit processors actually use 48 bits for addressing memory. This is not guaranteed, though; some really do use all 64 bits, and I believe some can be configured to use 64 bits or some lesser value like 48 or 56 bits. It's all very confusing when you really dig into it. There is no standard for any of this; it's very implementation specific, and you will find all kinds of wild and crazy schemes out there. Thankfully, programmers don't have to care about this 99% of the time, even assembly language programmers. This is extremely low-level stuff that matters to very few people.

    Now let's get to the meat of the issue: data types like Integer. It's very important to understand that data types like Integer are compiler specific. They are implemented by compilers, not by whatever processor you are using. An Integer type in a high-level language is 32 bits because the compiler says it's 32 bits. However, compilers can be clever little critters. There is nothing stopping a compiler from representing a single Integer type as 32 bits sometimes and as 64 bits other times. C/C++ compilers use typedefs, conditional compilation and macros to implement this kind of behavior. Compilers like the JIT compiler of the .Net CLR are programmed to treat the IntPtr type as 32 bits when the application is executed as a 32-bit application and as 64 bits when it's executed as a 64-bit application. Let's dig a little deeper into what goes on here.

    So how does a compiler implement Integers? Well, remember what I mentioned about general-purpose registers? Compilers typically match their Integer sizes to the sizes of those registers. But there is something you must understand about these registers: they can be partitioned into smaller sections. The most commonly used general-purpose registers on a typical 64-bit x86 processor are RAX, RBX, RCX and RDX. There are more, but let's keep it simple for the purposes of this explanation. Each of these registers can also be treated as a 32-bit register or as a 16-bit register. If you wanted to use the RAX register as a 32-bit register, for example, you'd refer to it as EAX, and the CPU would only deal with the lower 32 bits. If you refer to it as AX, it only deals with the lower 16 bits of the register and ignores the upper 48 bits. This is the main reason why 32-bit code can be executed on a 64-bit operating system. There is a little extra work involved in addressing memory as a 32-bit program on a 64-bit CPU, but let's ignore that to keep it simple.

    So let's see what compilers do to implement Integer types:
    Code:
    Dim a As Integer
    a = 5
    a = a + 1
    Let's pretend the above is some VB6 code. An Integer in VB6 is 16 bits. It is 16 bits because the compiler says it's 16 bits. The VB6 compiler might produce something that looks like this:
    Code:
    MOV AX, [a] ; Move the value 5 from variable a to the AX register
    ADD AX, 1    ; Add 1 to the AX register
    MOV [a], AX ; Move the result of the addition back to the variable a
    What is significant here is that the compiler is using the AX register, which, as I said above, is just the lower 16 bits of the RAX register on a 64-bit CPU. This can have some performance implications, but I won't get into all that because that is an entire topic by itself. What is important to understand here is that whenever the VB6 compiler encounters the Integer type, it will use the AX register (or another 16-bit register) whenever it has to perform operations on it like addition, subtraction, bit shifts, etc. This is essentially what makes it 16 bits: the use of a 16-bit register. If the .Net JIT were to compile that same VB code, it would produce something like this:
    Code:
    MOV EAX, [a] ; Move the value 5 from variable a to the EAX register
    ADD EAX, 1    ; Add 1 to the EAX register
    MOV [a], EAX ; Move the result of the addition back to the variable a
    The Integer type in the .NET runtime is 32 bits wide, so the compiler moves it around in the EAX register, which is just the lower 32 bits of the 64-bit RAX register. When the .Net JIT encounters a type like IntPtr, it checks whether it's running in a 64-bit process or a 32-bit process and produces the appropriate machine code based on that. This is what data types really come down to.
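    You can see that split from VB.NET itself with a small sketch (nothing fancy, just the framework reporting its own sizes; the IntPtr value depends on whether you run it as a 32-bit or 64-bit process, and the module name is made up):
    Code:
    Imports System
    Imports System.Runtime.InteropServices

    ' Integer (System.Int32) is defined by the runtime as 32 bits everywhere,
    ' while IntPtr follows the bitness of the running process.
    Module BitnessDemo
        Sub Main()
            Console.WriteLine("64-bit process: " & Environment.Is64BitProcess)
            Console.WriteLine("IntPtr.Size:    " & IntPtr.Size)                       ' 4 in a 32-bit process, 8 in a 64-bit one
            Console.WriteLine("Integer size:   " & Marshal.SizeOf(GetType(Integer)))  ' always 4 bytes
        End Sub
    End Module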

    I want to quickly address what was said in the OP. I'm not entirely sure what he is asking, but he mentions something about accessing memory. How a CPU accesses memory can be a very deep topic, and I have no idea how deep he wants to go. I don't know all the deep details, so I'll just talk about what happens at the assembly/machine language level. On a 64-bit CPU, the most data that can be accessed from memory in a single instruction is 64 bits (leaving SIMD instructions aside); on a 32-bit CPU, it is 32 bits at most. How that actually happens at a lower level can get quite complicated. At the hardware level you have to consider things like bottlenecks and bus sizes. For example, it's entirely possible for the bus between a 64-bit CPU and memory to be 32 bits wide, which means that even if you use a single assembly instruction to fetch 64 bits of data from memory, the hardware actually performs two fetches. Note that this is an oversimplification; there is a whole lot more going on at the hardware level, and it can get really complicated very quickly.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  11. #11
    Angel of Code Niya
    Join Date
    Nov 2011
    Posts
    6,379

    Re: memory access in Win32, win64 and general memory access

    Quote Originally Posted by Eduardo-
    I mean whether, when copying from the RAM module to the CPU cache, it is done one byte at a time or several bytes at once (in parallel).
    If there is a single bus between the RAM and the CPU cache, the bus width determines how much data gets transferred per cycle. If there is a chain of buses between the RAM and the CPU cache, the narrowest of those buses determines how much can be transferred per cycle.
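    As a toy illustration of that "narrowest link" idea (a made-up VB.NET helper, not a model of any real memory controller):
    Code:
    ' Toy model: how many transfers the narrowest link needs to move one cache line,
    ' assuming each transfer moves the link's full width. Names are hypothetical.
    Module BusDemo
        Function TransfersNeeded(ByVal cacheLineBytes As Integer, ByVal narrowestBusBits As Integer) As Integer
            Dim bytesPerTransfer As Integer = narrowestBusBits \ 8
            Return (cacheLineBytes + bytesPerTransfer - 1) \ bytesPerTransfer  ' round up
        End Function

        Sub Main()
            Console.WriteLine(TransfersNeeded(64, 64))  ' 8 transfers over a 64-bit link
            Console.WriteLine(TransfersNeeded(64, 32))  ' 16 transfers if a 32-bit link is the bottleneck
        End Sub
    End Module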
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber
