n-channel

Unnamed repository; edit this file 'description' to name the repository.
Log | Files | Refs

commit f9791fff7b18267241b2f22feec68b5c71f677c8
parent 19f5f4f9f49c3218525f3221c995785609ef043b
Author: Samdal <samdal@protonmail.com>
Date:   Sat, 22 Feb 2025 00:13:32 +0100

add generic data structures post

Diffstat:
M_posts/2025-02-20-hello-world.md | 2+-
M_posts/2025-02-21-Software-Rants.md | 2+-
A_posts/2025-02-22-making-generic-data-structures-in-C.md | 228+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 230 insertions(+), 2 deletions(-)

diff --git a/_posts/2025-02-20-hello-world.md b/_posts/2025-02-20-hello-world.md @@ -4,7 +4,7 @@ title: Hello World description: Hello World #summary: What is the difference between various font formats? comments: true -tags: [writing] +tags: [misc] --- Hello world! This day marks the creation of this website. diff --git a/_posts/2025-02-21-Software-Rants.md b/_posts/2025-02-21-Software-Rants.md @@ -3,7 +3,7 @@ layout: post title: Software rant compilation description: A Compilation of rants about software, programming and OOP comments: true -tags: [writing, programming] +tags: [misc, programming] --- This is where I have my compilation of good programming rants worth watching. diff --git a/_posts/2025-02-22-making-generic-data-structures-in-C.md b/_posts/2025-02-22-making-generic-data-structures-in-C.md @@ -0,0 +1,228 @@ +--- +layout: post +title: Generic data structures in C +description: How to make generic data structures in C +#summary: What is the difference between various font formats? +comments: true +tags: [writing, programming, C] +--- + +Static typing is amazing, it reduces a lot of bugs. However, having too many types hinders the uniformity of your code. + +Modern languages have tooling to reduce the amount of work you have to do when your code base has a lot of similar types. +Although I don't always agree that creating and working with these types is a good idea, I do use generics in C as well. + +## How it's normally done in C + +Usually C programmers just use `void*` and then (hopefully) wrap functoins around the operations. +This has many issues, mainly that it is hard to work with, and has no type safety. + +> Note that in cases where you don't need type safety or easy top-level operations, `void*` is an excellent tool. + + +Example of a `void*` based linked list: +``` +typedef struct generic_linked_list generic_linked_list; +struct generic_linked_list { + generic_linked_list* next; + void* data; +}; + +generic_linked_list* create_node(void* data, size_t data_size_bytes) { + generic_linked_list* res = malloc(sizeof(*res)); + res->next = 0; + res->data = malloc(data_size_bytes); + memcpy(res->data, data, data_size_bytes); +} + +void llist_stack_push(generic_linked_list** first_node, void *data, size_t data_size_bytes) +{ + generic_linked_list* new_node = create_node(data, data_size_bytes); + new_node->next = *first_node; + *first_node = new_node; +} +``` + +## The simple way + +Instead of using `void*`, just create macros that assume member names. +``` +#define llist_queue_push(f,l,n) ((f)==0) ? (f)=(l)=(n) : ((l)->next=(n), (l)=(n), (n)->next=0) +#define llist_queue_pop(f,l) ((f)==(l)) ? ((f)=(l)=0) : ((f)=(f)->next) + +#define llist_stack_push(f,n) ((n)->next=(f), (f)=(n)) +#define llist_stack_pop(f) ((f)==0) ? 0 : ((f)=(f->next)) +``` + +This requires you to define the structures yourself. +``` +typedef struct llist_int llist_int; +struct llist_int { + llist_int* next; + int val; +} +``` + +Then you just use it like this: +``` +llist_int* list_first; + +llist_int* n = malloc((sizeof(*n))); +n->val = 10; +llist_stack_push(list_first, n); +``` + +It's a lot simpler, it's type safe and it gives the user control of memory management. + +Meanwhile, with the `void*` method you have to do `*(int*)list_first->data` just to read the memory. + +Having more control over datatypes also makes it easier to compose larger structures. +``` +typedef struct tree_node tree_node; +struct tree_node { + tree_node* parent; + + tree_node* child_first; + tree_node* child_last; + + tree_node* next; + tree_node* prev; + + int val; +} +``` +We can use this struct as is, we don't need all of the boiler plate that comes with a different solution. + +## The "hide away" method + +You can also hide away your header information, and then provide the type in a nicer way to the user. +``` +typedef struct { + int sz; +} _dyn_arr_header; +#define dyn_arr(T) T* + +#define dyn_arr_get_sz(arr) (((_dyn_arr_header*)((u8*)(arr) - sizeof(_dyn_arr_header)))->sz) + +#define dyn_arr_push(arr, new_val) ( \ + arr = realloc(sizeof(_dyn_arr_header) + dyn_arr_get_sz(arr) + 1), \ + arr[dyn_arr_get_sz(arr)-1] = new_val \ + ) +``` +``` +dyn_arr(float) my_arr = NULL; + +dyn_arr_push(my_arr, 10.0f); +dyn_arr_push(my_arr, 20.0f); + +assert(my_arr[1] == 20.0f);; +``` +I don't really use dynamic arrays anymore, and that's mostly where this technique is useful. + + +## Going a bit crazy + +Although just the fact that using generic operations rather than types is the main takeaway from this post, I figiured I should show how you can take this to the extreme. +So let's make a hash-table. + +To make it easier to use, we can provide macros that generate the correct structures. +``` +#define htable_def_ex(_htable_entry_T) \ + struct { \ + u64 table_sz; \ + _htable_entry_T** table; \ + } + +// if you don't care about having anonymous types, you can use this: +#define htable_def(_T) htable_def_ex(struct {u64 hash; void* next; _T val;}) +``` +Then we make our operations wrapped in generic macros like before. This time we make use of internal functions, and process the input and output of them. +``` +typedef struct {u64 hash; void* next;} null_htable_entry_t; +typedef htable_def_ex(null_htable_entry_t) _null_htable_t; + +#define htable_find(_HTABLE, _KEY) \ + ((_HTABLE) ? ( \ + (typeof((_HTABLE)->table[0]))_htable_find((_HTABLE), (_KEY)) \ + ) : NULL) + +void* _htable_find(void* htable, uint64_t key) +{ + _null_htable_t* ht = htable; + uint64_t index = key % ht->table_sz; + + for (null_htable_entry_header_t* it = ht->table[index]; it; it = it->next) { + if (it->hash == key) return (u8*)it + sizeof(null_htable_entry_t); + } + return NULL; +} + +#define htable_find_or_create(_HTABLE, _KEY)\ + ((typeof((_HTABLE)->table[0]))_htable_find_or_create(_HTABLE, sizeof(*(_HTABLE)->tmp_list), _KEY)) + +void* _htable_find_or_create(void* htable, size_t entry_sz, uint64_t key) +{ + null_htable_t* ht = htable; + null_htable_entry_t* result = _htable_find(htable, key); + + if (!result) { + uint64_t index = key % ht->table_sz; + + result = malloc(sizeof(null_htable_entry_t) + entry_sz); + + result->next = ht->table[index]; + result->hash = key; + ht->table[index] = result; + } + return result; +} +``` + +Usage code: +``` +htable_def(int) int_htable; +int_htable id_table = { + .table_sz = 100, + .table = malloc(1001 * sizeof(*id_table.table)), +}; + +{ + int* res = htable_find_or_create(&id_table, key); + *res = 10; +} + +{ + int* res = htable_find(&id_table, key); + assert(res); + assert(res == 10); +} +``` + +If you wish to avoid using `typeof`, you can make another struct member which is used as the "return" value, like this: +``` +#define htable_def_ex(_htable_entry_T) \ + struct { \ + u64 table_sz; \ + _htable_entry_T** table; \ + _htable_entry_T* last_find; \ + } + +#define htable_find(_HTABLE, _KEY) \ + ((_HTABLE) ? ( \ + (_HTABLE)->last_find = _htable_find((_HTABLE), (_KEY)) \ + ) : NULL) +``` + +## What you should remember +The best generic data structure is no generic data structure. Nowadays I pretty much exclusively use linked lists. This is because they are flexible, pointer stable and thereby play nicely with Arenas. The hash-table we made also plays well with Arenas. + +I very rarely find myself longing for a dynamic array, a hash-table or other more complicated data structures. +In the cases where you actually would benefit from having a re-usable generic data structure, don't do it the `void*` way. +Try doing the types raw and focus on making the operations "generic". + +You will be surprised at how much it reduces friction and boilerplate. + +<br> + +For the non-C people reading, you might have become thankful that your tooling does generics automatically. +I think of it differently, one of the beauties of C is that complicated things are hard to do. Experiencing that hardship steers you away from those solutions, and directs you towards simplicity.