English Wikipedia: False sharing

In computer science, false sharing is a performance-degrading usage pattern that can arise in systems with distributed, coherent caches at the granularity of the smallest resource block managed by the caching mechanism. When a system participant periodically accesses data that is not being altered by another party, but that data shares a cache block with data that is being altered, the caching protocol may force the first participant to reload the whole cache block despite a lack of logical necessity.^[1] The caching system is unaware of activity within this block and forces the first participant to bear the caching system overhead required by true shared access of a resource.

Multiprocessor CPU caches

By far the most common usage of this term is in modern multiprocessor CPU caches, where memory is cached in lines whose size is a small power of two (e.g., 64 aligned, contiguous bytes). If two processors operate on independent data that lie in a memory region small enough to fit in a single line, the cache coherency mechanisms in the system may force the whole line across the bus or interconnect with every data write, causing memory stalls in addition to wasting system bandwidth. In some cases, the elimination of false sharing can result in order-of-magnitude performance improvements.^[2] False sharing is an inherent artifact of automatically synchronized cache protocols and can also exist in environments such as distributed file systems or databases, but it is currently prevalent only in RAM caches.
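Whether two objects can falsely share is purely a question of addresses: they do if they fall into the same aligned, line-sized block. The following is a minimal sketch of that check, assuming a 64-byte line; `kLineSize`, `line_of`, `same_line`, `Counters` and `counters_share_line` are hypothetical names introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed line size; 64 bytes is common but not universal.
constexpr std::size_t kLineSize = 64;

// Index of the aligned, line-sized block that contains address p.
inline std::uintptr_t line_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kLineSize;
}

// True if both addresses are covered by the same cache line.
inline bool same_line(const void* x, const void* y) {
    return line_of(x) == line_of(y);
}

// Two logically independent counters, e.g. one per worker thread.
// alignas pins the struct to the start of a line so the layout is predictable.
struct alignas(kLineSize) Counters {
    long a; // written only by thread A
    long b; // written only by thread B
};

// Reports whether a and b of one Counters object occupy the same line.
inline bool counters_share_line() {
    Counters c{};
    assert(same_line(&c, &c.a)); // a sits at the start of the line
    return same_line(&c.a, &c.b);
}
```

Here `counters_share_line()` returns true: the two members are independent in the program's logic, yet every write to `a` by one processor can invalidate the line that another processor needs for `b`.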

Example

#include <iostream>
#include <thread>
#include <new>
#include <atomic>
#include <chrono>
#include <latch>
#include <vector>

using namespace std;
using namespace chrono;

#if defined(__cpp_lib_hardware_interference_size)
// cache line size as reported by the standard library (a compile-time constant)
constexpr size_t CL_SIZE = hardware_constructive_interference_size;
#else
// most common cacheline size otherwise
constexpr size_t CL_SIZE = 64;
#endif

int main()
{
    vector<jthread> threads;
    int hc = jthread::hardware_concurrency();
    hc = hc <= (int)CL_SIZE ? hc : (int)CL_SIZE; // at most one thread per byte of the cache line
    for( int nThreads = 1; nThreads <= hc; ++nThreads )
    {
        // coarse synchronization of the thread start at kernel level
        latch coarseSync( nThreads );
        // fine synchronization via an atomic in user space
        atomic_uint fineSync( nThreads );
        // as many chars as fit into one cache line
        struct alignas(CL_SIZE) { char shareds[CL_SIZE]; } cacheLine;
        // sum of all threads execution times
        atomic_int64_t nsSum( 0 );
        for( int t = 0; t != nThreads; ++t )
            threads.emplace_back(
                [&]( char volatile &c )
                {
                    coarseSync.arrive_and_wait(); // synch beginning of thread execution on kernel-level
                    if( fineSync.fetch_sub( 1, memory_order::relaxed ) != 1 ) // fine-synch on user-level
                        while( fineSync.load( memory_order::relaxed ) );
                    auto start = high_resolution_clock::now();
                    for( size_t r = 10'000'000; r--; )
                        c = c + 1;
                    nsSum += duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count();
                }, ref( cacheLine.shareds[t] ) );
        threads.resize( 0 ); // join all threads
        cout << nThreads << ": " << (int)(nsSum / (1.0e7 * nThreads) + 0.5) << endl;
    }
}

This code shows the effect of false sharing. It creates an increasing number of threads, from one thread up to the number of hardware threads in the system (capped at the number of bytes in a cache line). Each thread repeatedly increments one byte of a cache line that, as a whole, is shared among all threads. The higher the level of contention between threads, the longer each increment takes. On a Zen4 system with 16 cores and 32 threads, a single increment on the shared cache line can take up to 100 nanoseconds, which corresponds to approximately 420 clock cycles on this CPU.
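As a quick consistency check on these two figures, the implied core clock can be computed directly. This assumes the cycle count was obtained by multiplying the measured time by a fixed clock frequency, which the text does not state explicitly.

```latex
f \approx \frac{420\ \text{cycles}}{100\ \text{ns}} = 4.2\ \text{GHz}
```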

Mitigation

There are ways of mitigating the effects of false sharing. For instance, false sharing in CPU caches can be prevented by reordering variables or adding padding (unused bytes) between variables. However, some of these program changes may increase the size of the objects, leading to higher memory use.^[2] Compile-time data transformations can also mitigate false sharing.^[3] However, some of these transformations may not always be allowed. For instance, the C++23 draft of the C++ standard requires that data members be laid out so that later members have higher addresses.^[4]
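The padding approach can be sketched in C++ with `alignas`, which makes the compiler insert the unused bytes. This is an illustrative sketch, not a definitive recipe: the 64-byte `kLineSize` is an assumption, and `PackedCounter`, `PaddedCounter` and `neighbours_share_line` are hypothetical names. Where the standard library provides it, `std::hardware_destructive_interference_size` from `<new>` could replace the hard-coded value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed line size; adjust for the target CPU.
constexpr std::size_t kLineSize = 64;

// Unpadded: neighbouring array elements are packed into one cache line.
struct PackedCounter {
    long value = 0;
};

// Padded: alignas on the member raises the struct's alignment to a full
// line, so the compiler also pads its size up to kLineSize bytes.
struct PaddedCounter {
    alignas(kLineSize) long value = 0;
};
static_assert(alignof(PaddedCounter) == kLineSize, "one line per counter");
static_assert(sizeof(PaddedCounter) == kLineSize, "padding fills the line");

// True if the two elements of a two-element array start in the same line.
template <class T>
bool neighbours_share_line(const T (&arr)[2]) {
    auto line = [](const void* p) {
        return reinterpret_cast<std::uintptr_t>(p) / kLineSize;
    };
    return line(&arr[0]) == line(&arr[1]);
}
```

With `PaddedCounter`, per-thread counters stored in an array each occupy their own line, at the cost of `kLineSize` bytes per counter instead of `sizeof(long)`; this is the memory overhead mentioned above.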

There are tools for detecting false sharing.^[5]^[6] There are also systems that both detect and repair false sharing in executing programs. However, these systems incur some execution overhead.^[7]^[8]

References

[1] Patterson (2012), p. 537. Book.

[2] Bolosky. Journal article.

[3] Jeremiassen; Eggers (1995), pp. 179–188. Journal article.

[4] Web page.

[5] Web page.

[6] Conference paper.

[7] Conference paper.

[8] Journal article.
