Somewhat relational database using wiredtiger (and else)

guile-user

From:	Amirouche Boubekki
Subject:	Somewhat relational database using wiredtiger (and else)
Date:	Sun, 26 Jun 2016 23:10:10 +0200
User-agent:	Roundcube Webmail/1.1.2

Héllo all,

# Click bait

I've written an article that tries to spell out a workflow similar
to the one used with an RDBMS. It's very natural to do this in wiredtiger
even if you lack the high-level abstractions of a SQL DSL. Through
a few procedures I explain that it's simple to:

- Define tables with simple indices
- Insert new rows in tables
- Resolve foreign keys via an index
- Paginate, which is basically smart calls to list-tail and list-head
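
As a rough illustration of the four points above, here is a minimal
sketch that uses in-memory hash tables as stand-ins for wiredtiger
tables. It does not use the guile-wiredtiger API, and every name in it
(authors, posts, posts-by-author, insert-post!, paginate, ...) is made
up for this example.

```scheme
;; In-memory stand-ins for two tables and one index. All names are
;; hypothetical; a real version would store rows in wiredtiger tables.
(define authors (make-hash-table))          ; author id -> name
(define posts (make-hash-table))            ; post id -> (author-id . title)
(define posts-by-author (make-hash-table))  ; index: author id -> post ids

(define (insert-author! id name)
  (hash-set! authors id name))

(define (insert-post! id author-id title)
  ;; Insert the row, then keep the index in sync with it.
  (hash-set! posts id (cons author-id title))
  (hash-set! posts-by-author author-id
             (cons id (hash-ref posts-by-author author-id '()))))

(define (post-author-name post-id)
  ;; Resolve the foreign key stored in the post row.
  (hash-ref authors (car (hash-ref posts post-id))))

(define (post-titles-of author-id)
  ;; Follow the index from an author to its posts, oldest first.
  (map (lambda (id) (cdr (hash-ref posts id)))
       (reverse (hash-ref posts-by-author author-id '()))))

(define (paginate rows page size)
  ;; PAGE is zero-based. Skip PAGE * SIZE rows with list-tail, then
  ;; take at most SIZE rows with list-head. Both are clamped to the
  ;; length of the list so that out-of-range pages return '().
  (let* ((start (min (* page size) (length rows)))
         (remaining (list-tail rows start)))
    (list-head remaining (min size (length remaining)))))

(insert-author! 1 "amz3")
(insert-post! 10 1 "first")
(insert-post! 11 1 "second")
(insert-post! 12 1 "third")

(post-author-name 11)               ; => "amz3"
(paginate (post-titles-of 1) 0 2)   ; => ("first" "second")
(paginate (post-titles-of 1) 1 2)   ; => ("third")
```

The index is just a second table that maps a column value back to
primary keys, which is why resolving a foreign key in either direction
is a plain lookup.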

I think this covers all basic uses of an RDBMS except multithreading
and transactions. If you think I missed a basic usage of databases
please tell me!

Have a look at http://hyperdev.fr/notes/somewhat-relational-database-library-using-wiredtiger.html


## Transactions

wiredtiger supports transactions; have a look at the source of
wiredtigerz.scm for a quick introduction.

## Multithreading and Multiprocessing

wiredtiger doesn't support multiprocessing but does support
multithreading.

I think that a good start, if you require multiprocessing, is to
create a database server similar to the UAV database server [0],
which relies on eval. If you have security concerns you can create
a database server that does RPC via stored procedures. And since
`read' and `write' are not safe either, you might find the msgpack
Scheme port useful [1].
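
The stored-procedure idea can be sketched without eval. The example
below is only an illustration under assumptions: the request is taken
to be already decoded from the wire (for instance by a msgpack reader)
into a list of the form (NAME ARG ...), and the names
stored-procedures, register-procedure! and dispatch are invented here.
This is not how the UAV database server works.

```scheme
;; Registry of the procedures that clients are allowed to call by name.
(define stored-procedures (make-hash-table))

(define (register-procedure! name proc)
  (hash-set! stored-procedures name proc))

(define (dispatch request)
  ;; REQUEST is a list (NAME ARG ...) already decoded from the wire.
  ;; Only registered names are callable; nothing is passed to eval.
  (let ((proc (hash-ref stored-procedures (car request))))
    (if proc
        (list 'ok (apply proc (cdr request)))
        (list 'error 'unknown-procedure (car request)))))

(register-procedure! 'add (lambda (a b) (+ a b)))

(dispatch '(add 1 2))       ; => (ok 3)
(dispatch '(system "ls"))   ; => (error unknown-procedure system)
```

The point of the design is that the client only ever chooses a name
from a fixed list, so arbitrary code never reaches the server.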

[0] https://framagit.org/a-guile-mind/hyper/blob/master/uav.scm#L190

[1] https://framagit.org/a-guile-mind/guile-wiredtiger/blob/master/msgpack.scm


# Where to go from here

Guile bindings of wiredtiger 2.6.1 can be fetched using
the following command:

git clone https://framagit.org/a-guile-mind/guile-wiredtiger.git


wiredtiger itself is still available online via:

wget http://source.wiredtiger.com/releases/wiredtiger-2.6.1.tar.bz2


It's very simple to install and the guix recipe is trivial ;)

Be careful: only 64-bit architectures are supported by wiredtiger.

## Reading code

I prototyped a few things with these bindings:

- I mocked up a hypergraph database called culture [2]

- A tuple space database (with SPARQL-like querying, supported by
  minikanren) called UAV database [3]

- UAV database is used to build nanoblog, a twitter-like blogging
  webapp [4]. nanoblog exists in an artanis version and in a plain
  Guile web version.

- A search engine based on UAV database called hyper [5]

[2] https://github.com/amirouche/Culturia/blob/master/culturia/culture.md

[3] https://framagit.org/a-guile-mind/hyper/blob/master/uav.scm
[4] https://git.framasoft.org/a-guile-mind/nanoblog/
[5] https://framagit.org/a-guile-mind/hyper

## Reading frenglish

I wrote two other articles about these bindings on my blog:

- Getting started with guile-wiredtiger [6]
- Getting started with UAV database [7]

[6] http://hyperdev.fr/notes/getting-started-with-guile-wiredtiger.html

[7] http://hyperdev.fr/notes/getting-started-with-guile-uav-database.html


## Where Do *I* Go From Here

From my perspective, using guile-wiredtiger is the most convenient way
to create database-backed applications in Guile, but then I know the
API very well.

I have also read good things about it, and I compared it against bsddb
and leveldb, which have fewer features and are slower.

It lacks a true multithreading story right now, but I think it's not
too complex to come up with a multithreaded server that uses
msgpack to transport the query with its params and binds the query
to the params using eval server side. I think it's secure to do it
this way.
I do not have the motivation to code it right now because of the
lack of feedback. Remember, maybe I'm drinking my own kool-aid...

Don't be fooled by the fact that wiredtiger was *recently* acquired by
mongodb. It has only been the primary backend of mongodb since 3.2
(IIRC) and there hasn't been much feedback on mongodb since then.

Also wiredtiger is *not* mongodb.

(Beware, there are a lot of buzzwords in what follows.)

Regarding hyper, the search engine: I've been thinking about moving
the database to an RDBMS style. This sounds like a more
wiredtiger-native solution.

The problem with that solution is that I think the wiredtiger API is
more difficult to understand in the context of a search engine than
the UAV database, which is a tuple database with a document-oriented
API. Similarly, UAV database is not as good as a graph database when it
comes to dealing with graphs. Again, maybe I drink too much of my own
kool-aid, but everything is a graph!

I want hyper to be the hackable search engine of the Culture [*]. As
such, it should be as simple as possible to manipulate the part of the
Internet that is scraped (and enriched) with *Scheme*. Yes, I want to
interact with my data using Scheme code primarily. Doing simple search
queries is a solved problem.

The big problem is to enrich the search engine's semantics and to make
it as simple as possible to do so, that is, to make it as simple as
possible to hack on hyper.


[*] http://cultureandempire.com/

That's what I've been looking for all along. So my current plan is to
experiment with a graph frontend [8] for wiredtiger. If the current
features of hyper look nicer with a graph API, I'll then think about
rebasing the graph frontend on top of the UAV layer. This is sort of a
graphdb implemented using an RDF store. This would be useful because I
*think* that sometimes the same data can be queried using SPARQL-like
queries and sometimes using a more graph-y approach like Gremlin
querying, but I'm not sure. (What about graph pattern matching?!)

[8] https://git.framasoft.org/a-guile-mind/guile-wiredtiger/blob/master/examples/graphdb/graphitisay.md


Choosing wiredtiger and Guile is part of my reasoning to make hyper
a hackable search engine.

Guile provides a nice threading story (a.k.a. no GIL) and writing code
to be eval'ed is much nicer than in non-lisp languages. OK, it sounds
like a bad idea, but bear with me.

Wiredtiger, on the other hand, is simply put the best database engine
out there. But why choose a low-level component like wiredtiger?

Part of my reasoning is that I still don't know which features I need
or want.

If I start with an RDBMS, I might end up with inefficient code for
doing spell checking or graph traversal. I will need to learn silly
database management commands instead of using cp. If I want to make
some feature usable, I'd need to choose another database, which would
make the overall solution more complex. If I use a graphdb, I'll inherit
a giant blob of Java (because all free graphdbs that are ACID are
Javaesque) which a) might not be as efficient as Guile for hacking,
b) might be a good solution for my professional career but not so
much for my interests, and c) means I will never have full control over
the database. And if you wonder why I didn't choose to use REDIS as my
primary data storage, you definitely don't know REDIS well enough.

Simply put, I chose a database engine because I want my data storage
to be versatile, safe and accessible. I chose wiredtiger because
it's the best of its kind.

The questions that remain to be answered are:

1) which primary data model: table, tuple or graph oriented?

Actually this is a feature-by-feature question. And guile-wiredtiger
is written in a way that makes it easy to compose database paradigms.

2) how to scale horizontally? how to do multiprocessing?

Now that I think about it again, I remember that one of the
founding principles of this project is that there will be *no*
horizontal scaling (multiple machines hosting hyper's database):
because of gravity cost (cf. Culture & Empire [*]) it will not be
required. This is kind of a philosophical ground.

I am mostly thinking about single-host vertical scaling, and that is the
most important matter right now since the search engine's semantics are
very poor. I'd rather code features first, then optimize and scale them.

Still, multiprocessing can be interesting to demo the search engine over
a greedy subset of the GNU Guile Internet... Wait... This can wait!



Thanks for your interest.

--
Amirouche ~ amz3 ~ http://www.hyperdev.fr
