\noindent This document is under intellectual property of NekoIT, reproduction is allowed in digital format only.\\% License information, replace this with your own license (if any)
\noindent This document is under intellectual property of NekoIT, reproduction and redistribution is allowed in digital format only without any modifications.\\% License information, replace this with your own license (if any)
\noindent\textit{First printing, 2019}% Printing/edition date
As \autoref{tab:server_pricing} shows, buying hardware in 2019 is less efficient than renting servers if you cannot host your servers on your own premises.
\subsection{Brand new hardware}
Brand new hardware is generally a real investment for an individual or a new company. It is also a technical choice that can have lifelong consequences for the business, as computers are subdivided into families that each have specific features and behaviours.
@ -461,7 +463,7 @@ We want to provide an online storage with the following properties:
\vspace{0.3em}Finally, it must be flexible and adaptable to multiple use-cases.
\vspace{1em}This leads us to the following idea: we aim to create a service that stores encrypted data in a layout similar to a disk, so that it has the same capabilities as a hard disk drive. The key to decrypt the data is stored online but encrypted with the user password. Authentication requires the user to be able to read the password to get a token. It is possible to leave said token disabled and enable it only with a second authentication factor.
\vspace{1em}This leads us to the following idea: we aim to create a service that stores encrypted data in a layout similar to a disk, so that it has the same capabilities as a hard disk drive. The key to decipher the data is stored online but encrypted with the user password. Authentication requires the user to be able to read the password to get a token. It is possible to leave said token disabled and enable it only with a second authentication factor.
We want our system to be protected from the point of view of our customers. As such, we aim to keep its code-base readable and short enough to be explored completely in 3 days by a developer with access to enough documentation.
@ -475,7 +477,7 @@ Our project aim to follow the following principles:
\item Principle of openness: we aim to disclose any incident that may happen, and to disclose any request by officials to access anyone's data.
\end{itemize}
The principle of least knowledge is upheld in the very design of the system: only the user can make sense of the address space of both the file system and block device. To provide an analogy, the data is stored in multiple boxes. The user-side software randomly labels the boxes and seals them (that seal is the encryption). If you store data that overflows from one box, it is stored in multiple boxes. Decrypting any data requires knowing which box is the first one and which is the next one, but that very piece of information is not stored on the server: it is stored in one of the sealed boxes.
The principle of least knowledge is upheld in the very design of the system: only the user can make sense of the address space of both the file system and block device. To provide an analogy, the data is stored in multiple boxes. The user-side software randomly labels the boxes and seals them (that seal is the encryption). If you store data that overflows from one box, it is stored in multiple boxes. Deciphering any data requires knowing which box is the first one and which is the next one, but that very piece of information is not stored on the server: it is stored in one of the sealed boxes.
Furthermore, the labels of each block of data can be used as a piece of the encryption process. This is an example of the principle of greater usage: we will use any additional information that can help make the system safer.
@ -512,12 +514,53 @@ Here is a list of the data that may be collected by us in any interaction with o
\chapter{A personal view on business practices}
\vspace{-5em}
\begin{center}
{\tiny This chapter expresses the view of the author of both this documentation and the software associated with it and only of him.}
\end{center}
\vspace{5em}
\vspace{-3em}\hspace{-1.4em}On the internet, we encounter a huge variety of businesses and of company cultures. We also encounter an equally huge variety of bad business practices.
As a person with peculiar ways of seeing good and bad, I will call bad any practice of actively doing things that may hurt a customer or have dire consequences for them.
\section{On selling the user}
For some companies, users are the resource that they sell to make money. I would like to distinguish three forms of business:
\begin{enumerate}
\item Companies that sell products and services
\item Companies that showcase or aggregate products and services of other companies and sell them
\item Companies that provide a service to someone for free in exchange for their personal information in order to sell said information to other companies
\end{enumerate}
I think it is fairly easy to see how users could be troubled by the idea that the service provided to them is a honeypot for advertisement information. I make a distinction between that and showing advertisements to the users of one's platform: in the latter case, the company provides information (advertisements) to its users but does not sell the information of any individual user of its platform, which makes it akin to advertisements on television or on cars.
I see two issues in selling personal information from users \emph{even if they signed a contract agreeing that you do}. The first is that users do not know to whom their personal information has been passed; in short, they lose control of their privacy. Coming from a country where keeping video surveillance records for longer than necessary is illegal and where one can forbid companies from using their phone number for any kind of communication, I was raised with the idea that you are free to exchange your own personal information but not anyone else's.
This leaves you master of your information at all times.
The second problem is a moral one, and it is twofold. First, the user is a product, no longer so much a human. This means that rather than respecting users as individuals, you have to gather a lot of them to create significant revenue. Second, it is about how explicitly companies following that kind of model inform their users of the way the model works and of the alternatives to having their data sold, which are generally nonexistent.
I do understand that people would be willing to sell their data for a service free of charge. It should be equally understandable that people would be willing to pay for that service so that their data is not sold to third parties.
\section{On misrepresenting the invisible}
\autocite{apple_sweatshop}
Some companies are proud to showcase their services or products as first-class, whether it is customer support or products, while they actually rely on third-world exploitation or on exploiting illegal immigrants\autocite{apple_sweatshop}.
There is no need to detail the moral implications of lying and of employing low-qualification labor as cheaply as possible. I will focus on how bad that is for your customers. First of all, under-qualified labor is very likely to give wrong advice about products or to underperform when providing a service, not to mention when building or assembling any sort of product.
Some also present their products as so advanced that they are magical, while they actually provide very little added value.
\vspace{1em}On this side is also the representation of the mythical beast named the Cloud. People tend to picture it as their data being stored in some imaginary location, as it effectively can be anywhere in the world. If a company could make sure that the data about a single customer is in a handful of its data centers, so that it could say ``your data is located in our data center of \emph{name of the city} in \emph{name of the country}'', it would make for a nice improvement.
\section{On practice of transparent business}
I think any level of openness of practices in a business is good. When you show the insides of your business and how it works, I think it brings both trust and customers to you.
Companies with open business practices have, in my opinion, the greatest chance of surviving for a long time and of avoiding scandal. I think it can be as open as presenting to the customers the cost of a product broken down by components and margin. This builds a relationship of trust between the business and its customers, making the business more capable of actually changing the price of its products when the costs it incurs to make them change.
\part{Functional specification}
@ -851,115 +894,13 @@ Behaviour of the login part of the page is expected to be identical to the behav
\part{Technical specification}
\chapter{Storage layer}
\section{\texttt{izaro-storage}}
\section{\texttt{db\_{}stats}}
\chapter{Coordination layer}
\section{Client to \texttt{izaro-coordinate}}
\section{\texttt{izaro-coordinate} to \texttt{izaro-storage}}
\chapter{Time synchronization}
\section{Steadiness requirement}
\section{Storage side requirement}
\chapter{Client side}
\section{Key blocks and system root layout}
\section{Command line user interface}
\section{Graphical user interface}
\chapter{Block storage system}
\chapter{Native file system}
\appendix
\part{Annexes}
%\setcounter{chapter}{1}
%\renewcommand\thechapter{\Alph{chapter}}
\chapter{Encryption popularized}
\label{annex:encryption_popularized}
\textbf{Encryption: }The act of transforming a message into a random-looking cipher-text. The original message is often named the plain-text.
\hspace{-1.4em}\textbf{Cipher: }The mathematical function that transforms a plain-text into a cipher-text.
\vspace{1.5em}\hspace{-1.4em}Encrypting data can be done in various ways. Each way has its own properties and its own resistance to certain types of attacks. All modern cryptography is key-based cryptography. It means that the way we encrypt data is not secret; what is secret is a value, named the key, that is used to encrypt the data.
\section{Properties of encryption}
We will here explain some of the properties a cipher can hold.
\subsection{Resistance}
A cipher generally has its resistance expressed as a power of two (e.g.: $2^{103}$) or as a number of bits of entropy (e.g.: $103$~bits). It is to be noted that this scale is not linear: it is exponential.
This means that a cipher that has $104$~bits of entropy is 2 times harder to break than one of the same family with a resistance of $103$~bits. Comparing resistance between different families is not relevant.
\subsection{Compactness}
Compactness of a cipher means that if you encrypt a message of size $n$ you will obtain a cipher-text of the same size. Conversely, if a cipher can generate a cipher-text longer than its message, it is said to be not compact.
For example, let's consider a simple cipher: for a message $A$, read it as a number and multiply it by a value that will be the key.
If your message is, for example, 8 digits long, like $00005555$, and the key is $12345678$, the cipher-text will be equal to $5555\times12345678=68580241290$, which makes an 11-digit cipher-text from an 8-digit message.
\subsection{Homomorphism}
Homomorphism means that for a message $A$ and an operation $f$ (for example, $f : x \rightarrow x \times 2$, the operation of multiplying by 2), if you apply a cipher to $A$ and get a cipher-text $B$, there exists a way to apply $f$ to $B$ such that decrypting the result gives you the result of applying $f$ to $A$.
Expressed more simply, it means that you can execute operations on encrypted data without needing to decipher it or understand it. Very few encryption mechanisms are fully homomorphic and those are mostly in research\autocite{Gentry:2009:FHE:1834954}.
\section{Types of encryption}
Encryption comes in different forms depending on how it handles the cryptographic key. Some ciphers have only one key, which must be known for both encrypting and decrypting the data; we call those symmetrical ciphers. Others have two keys, one for encrypting and one for decrypting; we call those asymmetrical ciphers.
\subsection{Symmetrical encryption}
\begin{figure}[h]
\begin{center}
\begin{itemize}
\item AES
\item Chacha20
\item Blowfish
\item Serpent
\item Twofish
\item CAST5
\item RC4
\item DES
\item 3DES
\item Skipjack
\item IDEA
\end{itemize}
\end{center}
\caption{List of symmetrical ciphers}
\label{fig:sym_ciphers}
\end{figure}
\subsection{Asymmetrical encryption}
\begin{figure}[h]
\begin{center}
\begin{itemize}
\item Prime numbers based (RSA)
\item Elliptic curve based (ECDSA)
\item Paillier crypto system
\item Lattice based (NTRU, BLISS\autocite{Gentry:2009:FHE:1834954})
@ -1160,7 +1142,7 @@ Encryption can express itself in different forms regarding to its way to handle
\section{\texttt{izaro-coordinate} queries}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{bytefield}[bitwidth=0.48em]{64}
@ -1185,7 +1167,7 @@ Encryption can express itself in different forms regarding to its way to handle
\subsection{User payloads}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{bytefield}[bitwidth=0.48em]{64}
@ -1202,7 +1184,7 @@ Encryption can express itself in different forms regarding to its way to handle
\label{fig:read_request}
\end{figure}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{bytefield}[bitwidth=0.48em]{64}
@ -1222,12 +1204,13 @@ Encryption can express itself in different forms regarding to its way to handle
\subsection{Root user payloads}
\section{\texttt{izaro-coordinate} replies}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{bytefield}[bitwidth=0.48em]{64}
\bitheader{0,31,63}\\
\wordbox{1}{\texttt{page count}}\\
\wordbox{3}{\texttt{page max}}\\
@ -1243,7 +1226,7 @@ Encryption can express itself in different forms regarding to its way to handle
\section{\texttt{izaro-coordinate} timing protocol and consensus}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{sequencediagram}
\newthread{a}{: Client}
@ -1256,7 +1239,7 @@ Encryption can express itself in different forms regarding to its way to handle
\label{fig:time_proto}
\end{figure}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{sequencediagram}
\newthread{a}{: Client \#1}
@ -1276,7 +1259,7 @@ Encryption can express itself in different forms regarding to its way to handle
\label{fig:confirmation_proto}
\end{figure}
\begin{figure}[h]
\begin{figure}[H]
\centering
\begin{footnotesize}
\begin{sequencediagram}
@ -1322,6 +1305,117 @@ Encryption can express itself in different forms regarding to its way to handle
\label{fig:2user_confirmation_proto}
\end{figure}
\chapter{Storage layer}
\section{\texttt{izaro-storage}}
\section{\texttt{db\_{}stats}}
\chapter{Coordination layer}
\section{Client to \texttt{izaro-coordinate}}
\section{\texttt{izaro-coordinate} to \texttt{izaro-storage}}
\chapter{Time synchronization}
\section{Steadiness requirement}
\section{Storage side requirement}
\chapter{Client side}
\section{Key blocks and system root layout}
\section{Command line user interface}
\section{Graphical user interface}
\chapter{Block storage system}
\chapter{Native file system}
\appendix
\part{Annexes}
%\setcounter{chapter}{1}
%\renewcommand\thechapter{\Alph{chapter}}
\chapter{Encryption popularized}
\label{annex:encryption_popularized}
\textbf{Encryption: }The act of transforming a message into a random-looking cipher-text. The original message is often named the plain-text.
\hspace{-1.4em}\textbf{Cipher: }The mathematical function that transforms a plain-text into a cipher-text.
\vspace{1.5em}\hspace{-1.4em}Encrypting data can be done in various ways. Each way has its own properties and its own resistance to certain types of attacks. All modern cryptography is key-based cryptography. It means that the way we encrypt data is not secret; what is secret is a value, named the key, that is used to encrypt the data.
\section{Properties of encryption}
We will here explain some of the properties a cipher can hold.
\subsection{Resistance}
A cipher generally has its resistance expressed as a power of two (e.g.: $2^{103}$) or as a number of bits of entropy (e.g.: $103$~bits). It is to be noted that this scale is not linear: it is exponential.
This means that a cipher that has $104$~bits of entropy is 2 times harder to break than one of the same family with a resistance of $103$~bits. Comparing resistance between different families is not relevant.
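To make the exponential scale concrete, here is a small worked comparison. The $128$-bit figure is only an illustrative assumption and does not refer to any particular cipher; both ciphers are assumed to belong to the same family.
\[ \frac{2^{104}}{2^{103}} = 2^{104-103} = 2 \]
\[ \frac{2^{128}}{2^{103}} = 2^{25} = 33554432 \approx 3.4 \times 10^{7} \]
In other words, each additional bit doubles the work, so $25$ additional bits multiply it by more than thirty million.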
\subsection{Compactness}
Compactness of a cipher means that if you encrypt a message of size $n$ you will obtain a cipher-text of the same size. Conversely, if a cipher can generate a cipher-text longer than its message, it is said to be not compact.
For example, let's consider a simple cipher: for a message $A$, read it as a number and multiply it by a value that will be the key.
If your message is, for example, 8 digits long, like $00005555$, and the key is $12345678$, the cipher-text will be equal to $5555\times12345678=68580241290$, which makes an 11-digit cipher-text from an 8-digit message.
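The growth can be checked by counting digits. As a sketch, write $d(x) = \lfloor \log_{10} x \rfloor + 1$ for the number of decimal digits of a positive integer $x$ (a notation introduced here only for illustration):
\[ d(68580241290) = \lfloor \log_{10} 68580241290 \rfloor + 1 = 10 + 1 = 11 \]
With the same key, the largest 8-digit message gives an even longer cipher-text:
\[ 99999999 \times 12345678 = 1234567787654322, \qquad d(1234567787654322) = 16 \]
This toy cipher can therefore double the size of the message.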
\subsection{Homomorphism}
Homomorphism means that for a message $A$ and an operation $f$ (for example, $f : x \rightarrow x \times 2$, the operation of multiplying by 2), if you apply a cipher to $A$ and get a cipher-text $B$, there exists a way to apply $f$ to $B$ such that deciphering the result gives you the result of applying $f$ to $A$.
Expressed more simply, it means that you can execute operations on encrypted data without needing to decipher it or understand it. Very few encryption mechanisms are fully homomorphic and those are mostly in research\autocite{Gentry:2009:FHE:1834954}.
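The toy multiplication cipher from the compactness section can illustrate this property. It offers no security whatsoever and is used here only as a sketch; the notation $E_k$ and $D_k$ for encrypting and deciphering with key $k$ is an assumption made for this example. With $E_k(A) = k \times A$ and $D_k(B) = B / k$, doubling the cipher-text doubles the message:
\[ f(E_k(A)) = 2 \times (k \times A) = k \times (2 \times A) = E_k(f(A)) \]
With $A = 5555$ and $k = 12345678$, the cipher-text and its double are:
\[ B = E_k(A) = 68580241290, \qquad f(B) = 2 \times B = 137160482580 \]
Deciphering the doubled cipher-text gives the doubled message:
\[ D_k(f(B)) = \frac{137160482580}{12345678} = 11110 = 2 \times 5555 = f(A) \]
This only shows homomorphism for multiplication by a constant, which is far from being fully homomorphic.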
\section{Types of encryption}
Encryption comes in different forms depending on how it handles the cryptographic key. Some ciphers have only one key, which must be known for both encrypting and deciphering the data; we call those symmetrical ciphers. Others have two keys, one for encrypting and one for deciphering; we call those asymmetrical ciphers.
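As a sketch, the difference can be written with the notation $E$ for encrypting and $D$ for deciphering a message $m$. This notation and the key names $k$, $k_{\mathrm{pub}}$ and $k_{\mathrm{priv}}$ are assumptions made for this illustration. For a symmetrical cipher, the same key $k$ is used in both directions:
\[ D_{k}(E_{k}(m)) = m \]
For an asymmetrical cipher, the message is encrypted with one key and deciphered with the other:
\[ D_{k_{\mathrm{priv}}}(E_{k_{\mathrm{pub}}}(m)) = m \]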
\subsection{Symmetrical encryption}
Symmetrical encryption aims to encrypt data on a two-way channel. The key allows you to both write and read encrypted data, making it very suitable for securing a network channel once the keys have been safely exchanged, or for encrypting a disk.
\begin{figure}[h]
\begin{center}
\begin{itemize}
\item AES
\item Chacha20
\item Blowfish
\item Serpent
\item Twofish
\item CAST5
\item RC4
\item DES
\item 3DES
\item Skipjack
\item IDEA
\end{itemize}
\end{center}
\caption{List of symmetrical ciphers}
\label{fig:sym_ciphers}
\end{figure}
Symmetrical encryption also has the advantage that in some cases it is possible to interweave parts of the encrypted data together, making the data harder to decipher if a part of it is missing. This is especially useful when encrypting data that is read sequentially, like network data, but this feature is not suitable for encrypting data on a disk or on any random-access medium.
\subsection{Asymmetrical encryption}
The goal of asymmetrical encryption is to provide ways to authenticate messages, ways to encrypt a message with a key and decipher it with a different key, and more generally ways to exchange secrets.
\begin{figure}[h]
\begin{center}
\begin{itemize}
\item Prime factorization based (RSA)
\item Elliptic curve based (ECDSA)
\item Paillier crypto system
\item Lattice based (NTRU, BLISS\autocite{Gentry:2009:FHE:1834954})
\end{itemize}
\end{center}
\caption{List of asymmetrical ciphers}
\label{fig:asym_ciphers}
\end{figure}
This field is evolving a lot nowadays, as the most used algorithms are not very resistant to being deciphered by quantum computers. In particular, the prime factor decomposition problem on which some systems are based is hard to solve for typical computers but, in theory, not as hard for quantum computers.
These kinds of cryptographic systems are generally used to exchange keys for symmetrical encryption.