Let’s add some more stores and store combinators to Matryoshka, my composable storage library in Elixir.
Storage Combinators proposes a Mapping Store which applies transformations on references and values:
The Mapping Store is an abstract superclass modelled after a map() function or a Unix filter, applying simple transformations to its inputs to yield its outputs when communicating with its source. Due to the fact that stores have a slightly richer protocol than functions or filters, the mapping store has to perform three separate mappings:
- Map the reference before passing it to the source.
- Map the data that is read from the source after it is read.
- Map the data that is written to the source, before it is written.
This would be pretty useful for all sorts of deserialization / serialization stores: we could use regular functions to translate Elixir types back and forth into JSON (or XML, or S-expressions, or CSV…, ad infinitum) to store data on disk.
First things first, we’ll define the module and a type to store the three mapping functions:
- map_ref is a function that maps references (ref -> mapped_ref) before using them to locate values
- map_retrieved is a function (stored_value -> value) that maps values when retrieved (get/fetch) from the store
- map_to_store is a function (value -> stored_value) that maps values before storing them
I’ve only enforced the inner field for the struct. If a mapping function isn’t provided in the struct, I’ll default to the identity function (which returns its input value unchanged), so the reference/value won’t be transformed. This is a bit more convenient when defining mapping stores where we only want to map the reference, or only want to map the values on storage and retrieval.
/lib/matryoshka/impl/mapping_store.ex
defmodule Matryoshka.Impl.MappingStore do
  alias Matryoshka.IsStorage
  alias Matryoshka.Storage
  alias Matryoshka.Reference

  # Default mapping: return the input unchanged
  @identity &Function.identity/1

  @enforce_keys [:inner]
  defstruct [
    :inner,
    map_ref: @identity,
    map_retrieved: @identity,
    map_to_store: @identity
  ]

  @type t :: %__MODULE__{
          inner: IsStorage.t(),
          map_ref: (Reference.t() -> Reference.t()),
          map_retrieved: (any() -> any()),
          map_to_store: (any() -> any())
        }
  ...
I’ll also add a helper function mapping_store/2 to create the struct. We can change the mapping functions in the MappingStore by providing the functions as keyword options.
/lib/matryoshka/impl/mapping_store.ex
  ...
  def mapping_store(inner, opts \\ []) do
    # Any mapping function not given in opts falls back to the identity function
    map_ref = Keyword.get(opts, :map_ref, @identity)
    map_retrieved = Keyword.get(opts, :map_retrieved, @identity)
    map_to_store = Keyword.get(opts, :map_to_store, @identity)

    %__MODULE__{
      inner: inner,
      map_ref: map_ref,
      map_retrieved: map_retrieved,
      map_to_store: map_to_store
    }
  end
  ...
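One alternative worth considering for the option handling is Keyword.validate!/2 (available since Elixir 1.13), which fills in defaults and raises on unknown keys, so a typo such as :map_reff fails loudly instead of silently falling back to the identity function. This is a minimal sketch of that idea, not what Matryoshka does; OptsSketch and resolve/1 are hypothetical names used only for illustration:

```elixir
defmodule OptsSketch do
  # Hypothetical module, not part of Matryoshka.
  @identity &Function.identity/1

  # Returns a map of the three mapping functions with defaults filled in.
  # Raises ArgumentError if opts contains a key other than the three below.
  def resolve(opts \\ []) do
    opts
    |> Keyword.validate!(
      map_ref: @identity,
      map_retrieved: @identity,
      map_to_store: @identity
    )
    |> Map.new()
  end
end

funs = OptsSketch.resolve(map_ref: &String.downcase/1)
IO.inspect(funs.map_ref.("Users/ADA"))
IO.inspect(funs.map_to_store.(42))
```

The trade-off is stricter input checking at the cost of requiring a newer Elixir version.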
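Now we just need to implement the Storage protocol for the module. Like all our storage combinators, we’re calling the Storage functions on the inner store, but we also call the mapping functions where necessary, i.e.:
- map_ref/1 on every reference before it’s passed to the inner store
- map_retrieved/1 on values returned by fetch/2 and get/2
- map_to_store/1 on values passed to put/3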
/lib/matryoshka/impl/mapping_store.ex
  ...
  alias __MODULE__

  defimpl Storage do
    def fetch(store, ref) do
      # Map the reference on the way in; map the value only on success
      case Storage.fetch(store.inner, store.map_ref.(ref)) do
        {:ok, value} -> {:ok, store.map_retrieved.(value)}
        error -> error
      end
    end

    def get(store, ref) do
      # nil means "not found", so it is passed through unmapped
      case Storage.get(store.inner, store.map_ref.(ref)) do
        nil -> nil
        value -> store.map_retrieved.(value)
      end
    end

    def put(store, ref, value) do
      inner_new =
        Storage.put(
          store.inner,
          store.map_ref.(ref),
          store.map_to_store.(value)
        )

      %{store | inner: inner_new}
    end

    def delete(store, ref) do
      inner_new = Storage.delete(store.inner, store.map_ref.(ref))
      %{store | inner: inner_new}
    end
  end
end
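To get a feel for what the three mappings do on the way in and out, here is a rough sketch that uses a plain map as a stand-in for the inner store. The "terms/" prefix and the choice of :erlang.term_to_binary/1 and :erlang.binary_to_term/1 as the serialization pair are assumptions made for this illustration, not something Matryoshka prescribes:

```elixir
# A plain map stands in for the inner store; all names below are hypothetical.
map_ref = fn ref -> "terms/" <> ref end
map_to_store = &:erlang.term_to_binary/1
map_retrieved = &:erlang.binary_to_term/1

# put: map the reference and the value before writing
inner = Map.put(%{}, map_ref.("ada"), map_to_store.(%{name: "Ada"}))

# fetch: map the reference again, then map the value only on success
result =
  case Map.fetch(inner, map_ref.("ada")) do
    {:ok, stored} -> {:ok, map_retrieved.(stored)}
    :error -> {:error, :not_found}
  end

IO.inspect(Map.keys(inner))
IO.inspect(result)
```

The inner store only ever sees the prefixed reference and the binary, while the caller only ever sees the original reference and the Elixir term.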
Storage Combinators defines a switching store which distributes requests to subsidiary stores. In the first post of this series, I mentioned that:
In cases where we would use a scheme in a URI, we can simply use the first path segment
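So the idea behind my implementation is to treat the first path segment of a reference as the name of a substore, and to pass the rest of the path to that substore as the reference.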
We’ll keep the stores in a map of strings to stores underneath a struct:
/lib/matryoshka/impl/switching_store.ex
defmodule Matryoshka.Impl.SwitchingStore do
  alias Matryoshka.Impl.SwitchingStore
  alias Matryoshka.Reference
  alias Matryoshka.Storage
  alias Matryoshka.IsStorage

  @enforce_keys [:path_store_map]
  defstruct @enforce_keys

  @type t :: %__MODULE__{
          path_store_map: %{String.t() => IsStorage.t()}
        }

  def switching_store(path_store_map) when is_map(path_store_map) do
    %__MODULE__{
      path_store_map: path_store_map
    }
  end
  ...
Before defining the implementations for storage, let’s get some helper functions defined.
I want a function to update the path store map whenever we update an inner store. This just needs to reach into the underlying path_store_map and put the updated store there:
/lib/matryoshka/impl/switching_store.ex
  ...
  alias __MODULE__

  def update_substore(store, sub_store, sub_store_ref) do
    # Replace the substore under sub_store_ref and rebuild the struct
    store.path_store_map
    |> Map.put(sub_store_ref, sub_store)
    |> SwitchingStore.switching_store()
  end
  ...
I’d also like a function to split a reference into two references:
- The first path segment
- The remaining path segments (joined back together with /)
/lib/matryoshka/impl/switching_store.ex
  ...
  def split_reference(ref) do
    # A reference needs at least two path segments: one to pick the
    # substore and at least one more to address a value inside it.
    case Reference.path_segments(ref) do
      [path_head | [_ | _] = path_tail] ->
        {:ok, path_head, Enum.join(path_tail, "/")}

      _too_short ->
        {:error, {:ref_path_too_short, ref}}
    end
  end
  ...
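Here is a small, self-contained sketch of the splitting behaviour. Since Reference.path_segments/1 lives elsewhere in the library, this version swaps in String.split/3 as a stand-in, which is an assumption about how segments are produced; SplitSketch is a hypothetical module name:

```elixir
defmodule SplitSketch do
  # Stand-in for Reference.path_segments/1; splitting on "/" is an assumption.
  defp path_segments(ref), do: String.split(ref, "/", trim: true)

  # Splits a reference into its first segment and the rest of the path.
  def split_reference(ref) do
    case path_segments(ref) do
      [head | [_ | _] = tail] -> {:ok, head, Enum.join(tail, "/")}
      _too_short -> {:error, {:ref_path_too_short, ref}}
    end
  end
end

IO.inspect(SplitSketch.split_reference("disk/users/ada"))
IO.inspect(SplitSketch.split_reference("disk"))
```

A reference with only one segment has nothing left to hand to the substore, which is why it is reported as too short.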
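To make life more convenient, I also want a function which splits the reference, looks up the substore registered under the first path segment, and returns the substore together with both halves of the reference (or an error if either step fails):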
/lib/matryoshka/impl/switching_store.ex
  ...
  def locate_substore(store, ref) do
    # Each step is tagged so the else clauses can tell which one failed
    with {:split_ref, {:ok, path_first, path_rest}} <-
           {:split_ref, SwitchingStore.split_reference(ref)},
         {:fetch_substore, {:ok, sub_store}} <-
           {:fetch_substore, Map.fetch(store.path_store_map, path_first)} do
      {:ok, sub_store, path_first, path_rest}
    else
      {:split_ref, error} -> error
      {:fetch_substore, :error} -> {:error, :no_substore}
    end
  end
  ...
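Now with those helper functions out of the way, we can define the functions for Storage. The basic gist is the same across all of them: locate the substore for the reference, call the same Storage function on that substore with the rest of the path, and (for put/3 and delete/2) put the updated substore back into the path store map: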
/lib/matryoshka/impl/switching_store.ex
  ...
  defimpl Storage do
    def fetch(store, ref) do
      with {:locate, {:ok, sub_store, _path_first, path_rest}} <-
             {:locate, SwitchingStore.locate_substore(store, ref)},
           {:fetch, {:ok, value}} <-
             {:fetch, Storage.fetch(sub_store, path_rest)} do
        {:ok, value}
      else
        {:locate, error} -> error
        {:fetch, error} -> error
      end
    end

    def get(store, ref) do
      with {:ok, sub_store, _path_first, path_rest} <-
             SwitchingStore.locate_substore(store, ref) do
        Storage.get(sub_store, path_rest)
      else
        _error -> nil
      end
    end

    def put(store, ref, value) do
      with {:ok, sub_store, path_first, path_rest} <-
             SwitchingStore.locate_substore(store, ref) do
        new_sub_store = Storage.put(sub_store, path_rest, value)
        SwitchingStore.update_substore(store, new_sub_store, path_first)
      else
        # No matching substore: leave the store unchanged
        _ -> store
      end
    end

    def delete(store, ref) do
      with {:ok, sub_store, path_first, path_rest} <-
             SwitchingStore.locate_substore(store, ref) do
        new_sub_store = Storage.delete(sub_store, path_rest)
        SwitchingStore.update_substore(store, new_sub_store, path_first)
      else
        # No matching substore: leave the store unchanged
        _ -> store
      end
    end
  end
end
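To see the routing in isolation, here is a rough sketch where plain maps stand in for both the path store map and the substores. SwitchSketch is a hypothetical module, and splitting references with String.split/3 is an assumption standing in for the library’s Reference helpers:

```elixir
defmodule SwitchSketch do
  # Hypothetical stand-in: `stores` is a map of name => plain map.
  def locate(stores, ref) do
    with [first | [_ | _] = rest] <- String.split(ref, "/", trim: true),
         {:ok, sub} <- Map.fetch(stores, first) do
      {:ok, sub, first, Enum.join(rest, "/")}
    else
      :error -> {:error, :no_substore}
      _segments -> {:error, {:ref_path_too_short, ref}}
    end
  end

  # Writes go to the substore named by the first segment, which is then
  # put back into the outer map. Unroutable writes leave it unchanged.
  def put(stores, ref, value) do
    case locate(stores, ref) do
      {:ok, sub, first, rest} -> Map.put(stores, first, Map.put(sub, rest, value))
      {:error, _reason} -> stores
    end
  end

  def get(stores, ref) do
    case locate(stores, ref) do
      {:ok, sub, _first, rest} -> Map.get(sub, rest)
      {:error, _reason} -> nil
    end
  end
end

stores = %{"mem" => %{}, "disk" => %{}}
stores = SwitchSketch.put(stores, "mem/users/ada", 1)
IO.inspect(SwitchSketch.get(stores, "mem/users/ada"))
IO.inspect(SwitchSketch.get(stores, "disk/users/ada"))
```

One thing the sketch makes visible is that a put to an unknown substore is silently dropped, so callers that care may prefer to check the reference with locate/2 first.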
With SwitchingStore, we’ve broken ground on store combinators that compose an arbitrarily large number of stores. Let’s continue with a BackupStore, which will retrieve values only from a main store, but store values in both the main store and a list of target stores.
Once again we start with a struct and a helper function to construct the struct:
/lib/matryoshka/impl/backup_store.ex
defmodule Matryoshka.Impl.BackupStore do
  alias Matryoshka.IsStorage
  alias Matryoshka.Storage

  @enforce_keys [:source_store, :target_stores]
  defstruct @enforce_keys

  @type t :: %__MODULE__{
          source_store: IsStorage.t(),
          target_stores: list(IsStorage.t())
        }

  def backup_store(source_store, target_stores)
      when is_struct(source_store) and is_list(target_stores) do
    %__MODULE__{
      source_store: source_store,
      target_stores: target_stores
    }
  end
  ...
Now let’s define the Storage functionality. fetch/2 and get/2 just delegate their calls to the inner source store:
/lib/matryoshka/impl/backup_store.ex
  ...
  alias __MODULE__

  defimpl Storage do
    # Reads only ever consult the source store
    def fetch(store, ref) do
      Storage.fetch(store.source_store, ref)
    end

    def get(store, ref) do
      Storage.get(store.source_store, ref)
    end
    ...
While put/3 and delete/2 map over the source and target stores, then wrap up the updated stores into the BackupStore struct:
/lib/matryoshka/impl/backup_store.ex
    ...
    def put(store, ref, value) do
      source_store = Storage.put(store.source_store, ref, value)

      # Apply the same write to every target store
      target_stores =
        Enum.map(
          store.target_stores,
          fn target -> Storage.put(target, ref, value) end
        )

      BackupStore.backup_store(source_store, target_stores)
    end

    def delete(store, ref) do
      source_store = Storage.delete(store.source_store, ref)

      # Apply the same delete to every target store
      target_stores =
        Enum.map(
          store.target_stores,
          fn target -> Storage.delete(target, ref) end
        )

      BackupStore.backup_store(source_store, target_stores)
    end
  end
end
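The fan-out behaviour can be sketched without any of the library machinery. In this illustration, a {source, targets} tuple of plain maps stands in for the BackupStore struct and its inner stores; BackupSketch is a hypothetical module name:

```elixir
defmodule BackupSketch do
  # Hypothetical stand-in: {source, targets}, with plain maps as stores.
  def put({source, targets}, ref, value) do
    {Map.put(source, ref, value), Enum.map(targets, &Map.put(&1, ref, value))}
  end

  def delete({source, targets}, ref) do
    {Map.delete(source, ref), Enum.map(targets, &Map.delete(&1, ref))}
  end

  # Reads never consult the targets.
  def get({source, _targets}, ref), do: Map.get(source, ref)
end

store = BackupSketch.put({%{}, [%{}, %{}]}, "a", 1)
IO.inspect(store)

# A value that exists only in a target is invisible to reads.
IO.inspect(BackupSketch.get({%{}, [%{"b" => 2}]}, "b"))
```

The last line shows the limitation in practice: data that only lives in a target store cannot be read back through the combinator.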
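The BackupStore is useful for keeping auxiliary stores updated with values, but we can never actually use those backup stores to retrieve values. It would be nice to use those alternate stores when the first store we check doesn’t have the value, so let’s create a CachingStore that caches data with the following requirements:
- Values are retrieved from the fast cache store first.
- If the cache store returns nil or an {:error, reason}, the value is retrieved from the main store and then written to the cache store.
- Puts and deletes are applied to both the main store and the cache store.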
Ah, but we run into an issue: fetch/2 and get/2 only return a value; they don’t return the store. That means we can’t update the CachingStore on gets and fetches, as we’d need to update the cache store inside. But we can fix that pretty easily by requiring fetch/2 and get/2 to return a tuple of {store, value} instead of just value. That way, we can have CachingStore update the cache store when the value is retrieved from the main store.
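To make the revised return shape concrete, here is a minimal sketch of a protocol where fetch/2 and get/2 return {store, value}. StorageSketch, its implementation for plain maps, and the {:no_ref, ref} error reason are all hypothetical names chosen for this illustration and are not Matryoshka’s actual definitions:

```elixir
defprotocol StorageSketch do
  # Hypothetical protocol mirroring the revised shape described above.
  @spec fetch(t(), String.t()) :: {t(), {:ok, any()} | {:error, any()}}
  def fetch(store, ref)

  @spec get(t(), String.t()) :: {t(), any()}
  def get(store, ref)

  @spec put(t(), String.t(), any()) :: t()
  def put(store, ref, value)

  @spec delete(t(), String.t()) :: t()
  def delete(store, ref)
end

defimpl StorageSketch, for: Map do
  # Reads hand the (unchanged) store back alongside the result.
  def fetch(map, ref) do
    case Map.fetch(map, ref) do
      {:ok, value} -> {map, {:ok, value}}
      :error -> {map, {:error, {:no_ref, ref}}}
    end
  end

  def get(map, ref), do: {map, Map.get(map, ref)}
  def put(map, ref, value), do: Map.put(map, ref, value)
  def delete(map, ref), do: Map.delete(map, ref)
end

store = StorageSketch.put(%{}, "a", 1)
{store, value} = StorageSketch.get(store, "a")
IO.inspect({store, value})
```

For a simple store the returned store is just the input, but the shape gives combinators room to return an updated copy of themselves on reads.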
There are some drawbacks to this update, including having to change every other store’s fetch/2 and get/2 implementations as well.
But I think it’s well worth it for caching.
Once more, we start with a struct and a helper function caching_store/2 to build the struct. I’ve also specialised the constructor function into caching_store/1, which defaults to using a MapStore as the fast cache store.
/lib/matryoshka/impl/caching_store.ex
defmodule Matryoshka.Impl.CachingStore do
  alias Matryoshka.IsStorage
  alias Matryoshka.Storage
  import Matryoshka.Impl.MapStore, only: [map_store: 0]

  @enforce_keys [:main_store, :cache_store]
  defstruct [:main_store, :cache_store]

  @type t :: %__MODULE__{
          main_store: IsStorage.t(),
          cache_store: IsStorage.t()
        }

  # Default to an in-memory MapStore as the cache
  def caching_store(main_storage), do: caching_store(main_storage, map_store())

  def caching_store(main_storage, fast_storage)
      when is_struct(main_storage) and is_struct(fast_storage) do
    %__MODULE__{main_store: main_storage, cache_store: fast_storage}
  end
  ...
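Both fetch/2 and get/2 follow the same general idea:
- Try the cache store first; if it has the value, return {store, value}
- Otherwise try the main store; if it has the value, put the value into the cache store and return {store, value}
- If neither store has the value, return nil or {:error, reason} in the shape {store, error_value}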
/lib/matryoshka/impl/caching_store.ex
  ...
  alias __MODULE__

  defimpl Storage do
    def fetch(store, ref) do
      {cache_store_new, val_fast} = Storage.fetch(store.cache_store, ref)

      case val_fast do
        # Cache hit
        {:ok, _value} ->
          new_store = %{store | cache_store: cache_store_new}
          {new_store, val_fast}

        # Cache miss: fall back to the main store
        {:error, _reason_fast} ->
          {main_store_new, val_main} = Storage.fetch(store.main_store, ref)

          case val_main do
            {:ok, value} ->
              cache_store_new = Storage.put(cache_store_new, ref, value)
              new_store = CachingStore.caching_store(main_store_new, cache_store_new)
              {new_store, val_main}

            {:error, reason} ->
              # Return the updated inner stores here too, as get/2 does
              new_store = CachingStore.caching_store(main_store_new, cache_store_new)
              {new_store, {:error, reason}}
          end
      end
    end

    def get(store, ref) do
      {cache_store_new, val_fast} = Storage.get(store.cache_store, ref)

      case val_fast do
        # Cache miss: fall back to the main store
        nil ->
          {main_store_new, val_main} = Storage.get(store.main_store, ref)

          case val_main do
            nil ->
              store_new = CachingStore.caching_store(main_store_new, cache_store_new)
              {store_new, nil}

            value ->
              cache_store_new = Storage.put(cache_store_new, ref, value)
              store_new = CachingStore.caching_store(main_store_new, cache_store_new)
              {store_new, value}
          end

        # Cache hit
        value ->
          store_new = %{store | cache_store: cache_store_new}
          {store_new, value}
      end
    end
    ...
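Stripped of the protocol plumbing, the read-through logic looks roughly like this. The sketch uses a {main, cache} tuple of plain maps as a stand-in for the CachingStore struct and its inner stores; CacheSketch is a hypothetical module name:

```elixir
defmodule CacheSketch do
  # Hypothetical stand-in: {main, cache}, with plain maps as stores.
  # Returns {store, value}, filling the cache on a miss.
  def get({main, cache} = store, ref) do
    case Map.get(cache, ref) do
      nil ->
        case Map.get(main, ref) do
          nil -> {store, nil}
          value -> {{main, Map.put(cache, ref, value)}, value}
        end

      value ->
        {store, value}
    end
  end
end

store = {%{"a" => 1}, %{}}
{store, value} = CacheSketch.get(store, "a")
IO.inspect(value)
IO.inspect(store)
```

One consequence of using nil as the miss signal is that a stored nil is always treated as a miss, so it is looked up in the main store every time and never cached.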
The code for put/3 and delete/2, on the other hand, is much easier. We update both stores (main and cache), then wrap them into a CachingStore struct:
/lib/matryoshka/impl/caching_store.ex
    ...
    def put(store, ref, value) do
      # Write through to both stores so the cache never goes stale
      main_store = Storage.put(store.main_store, ref, value)
      cache_store = Storage.put(store.cache_store, ref, value)
      CachingStore.caching_store(main_store, cache_store)
    end

    def delete(store, ref) do
      main_store = Storage.delete(store.main_store, ref)
      cache_store = Storage.delete(store.cache_store, ref)
      CachingStore.caching_store(main_store, cache_store)
    end
  end
end
…and that’s CachingStore done.
Great, we’ve defined the business logic for a few new useful store combinators, which means that it’s time to expose them in the Matryoshka module:
/lib/matryoshka.ex
  ...
  # Business logic
  defdelegate backup_store(source_store, target_stores), to: BackupStore
  defdelegate caching_store(main_store), to: CachingStore
  defdelegate caching_store(main_store, cache_store), to: CachingStore
  defdelegate logging_store(store), to: LoggingStore
  defdelegate map_store(), to: MapStore
  defdelegate map_store(map), to: MapStore
  defdelegate mapping_store(store, opts), to: MappingStore
  defdelegate pass_through(store), to: PassThrough
  defdelegate switching_store(path_store_map), to: SwitchingStore
end
There’s a glaring issue when it comes to using Matryoshka as a storage backend that I’ve not discussed yet.
We’ve got a bunch of storage combinators to add all sorts of functionality, which is great, but all our stores so far have been in-memory only; so we lose all the data when the store closes (i.e. because the store BEAM process terminates).
Now that we’ve implemented CachingStore, we have the ability to cache data using a fast store (which we can keep in-memory) with a backup main store (which we’ll keep on disk). So I think it’s high time we add stores that persist data to disk.
We’ll be doing that in the next post in this series.
You can see the latest version of Matryoshka at my GitHub.