From the errata of the Intel Quad cores:
AK43. Concurrent Multi-processor Writes to Non-dirty Page May Result in
Unpredictable Behavior
Problem: When a logical processor writes to a non-dirty page, and another logical
processor either writes to the same non-dirty page or explicitly sets the dirty
bit in the corresponding page table entry, complex interaction with internal
processor activity may cause unpredictable system behavior.
Implication: This erratum may result in unpredictable system behavior and hang.
Workaround: It is possible for BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
So correct me if I'm wrong here --
Basically if a page hasn't been written to since its data was last loaded from or written
to memory, it's not dirty.
I.e. dirty = the cached copy is modified, but the main memory copy isn't yet updated to match.
So when a first CPU core writes to memory page X, that page becomes
automatically marked dirty, right? Without software intervention,
the page's dirty bit would normally get set by the CPU, right?
Or is marking pages dirty always a software process?
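To pin down what I mean by the hardware setting the bit, here is a minimal Python sketch of my mental model. The flag positions (Present = bit 0, Read/Write = bit 1, Accessed = bit 5, Dirty = bit 6) are the ones in the x86 page table entry format; the function names and the example frame address are made up for illustration, and this says nothing about how the real page walker is implemented.

```python
# Flag bits of an x86 page table entry (PTE).
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # writes are allowed
PTE_ACCESSED = 1 << 5  # set when the page is read or written
PTE_DIRTY = 1 << 6     # set when the page is written


def is_dirty(pte: int) -> bool:
    """Return True if the Dirty flag is set in this entry."""
    return bool(pte & PTE_DIRTY)


def model_store(pte: int) -> int:
    """Hypothetical model of a store through this PTE.

    Returns the entry as it would look afterwards: the Accessed and
    Dirty flags are set without any software involvement.
    """
    if not (pte & PTE_PRESENT) or not (pte & PTE_WRITABLE):
        raise PermissionError("page fault: not present or not writable")
    return pte | PTE_ACCESSED | PTE_DIRTY


# Example: a clean, writable mapping of a made-up frame address.
clean_pte = 0x1234000 | PTE_PRESENT | PTE_WRITABLE
dirty_pte = model_store(clean_pte)
print(is_dirty(clean_pte), is_dirty(dirty_pte))
```

In this model only the first store changes the entry; later stores leave it as it is, which is why I'd guess the erratum window is around that first transition from non-dirty to dirty.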
How / when would another CPU core write to that same page when that
page ISN'T marked dirty? Are they saying that BEFORE it gets a chance to
be marked dirty, the other core could write to it in its still non-dirty state?
Why/when would another processor core (if it hasn't written to the page)
explicitly set that page's dirty bit?
How would the BIOS work around this problem? I don't get what the BIOS
could possibly do to make this situation better, unless they mean there's a
CPU microcode update for it. But if that were the case, wouldn't they just
say so, and say that the BIOS or OS could load corrective microcode
patches/updates?
Anyway, it sounds like the cache coherency must be wildly broken
if this can occur due to a race condition between cores writing
to the same page and/or cores affecting the same page's dirty bit.
If multiple cores CAN'T safely read and write the same cached copy of a
write-back memory page, doesn't that basically break the whole
usefulness of a write-back data cache and cache coherency in general?
I don't recall what page sizes CAN be set to, but I seem to recall they're
often 4 KB, though can be bigger or maybe smaller too. It doesn't seem
that uncommon for multiple cores to be writing SOMEWHERE within the
same page of memory if it contains some kind of common data
structure, e.g. semaphores, counters, shared buffer space or something like
that.
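For a sense of how easily two cores end up on the same page, here's a quick Python sketch. It assumes 4 KB (4096-byte) pages and 64-byte cache lines, which is my assumption for these chips; the helper names and the addresses are made up for illustration.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
CACHE_LINE = 64    # assumed cache line size in bytes


def same_page(addr_a: int, addr_b: int) -> bool:
    """True if both addresses fall within the same page."""
    return addr_a // PAGE_SIZE == addr_b // PAGE_SIZE


def same_cache_line(addr_a: int, addr_b: int) -> bool:
    """True if both addresses fall within the same cache line."""
    return addr_a // CACHE_LINE == addr_b // CACHE_LINE


# Number of distinct cache lines that share one page (and one dirty bit).
LINES_PER_PAGE = PAGE_SIZE // CACHE_LINE

# Two made-up addresses: a counter used by core 0 and a buffer slot
# used by core 1, far apart within the same page.
counter_addr = 0x10000
buffer_addr = 0x10FC0

print(LINES_PER_PAGE)
print(same_page(counter_addr, buffer_addr))
print(same_cache_line(counter_addr, buffer_addr))
```

So under these assumptions two cores can touch completely separate cache lines and still be writing to the same page, which means they'd share the same page table entry and the same dirty bit.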
If I am reading this correctly, it doesn't seem to say that the problem
will not occur if the processor cores use atomic operation instructions
or refrain from writing to the SAME cache lines within the page or whatever.
So it seems like it'd be "generally unsafe" for two cores to write to or
manage the same page no matter HOW they did it, e.g. even if areas of the
page were for the exclusive use of the individual cores.