# Construction of Fractional Repetition Codes with Variable Parameters for Distributed Storage Systems

^{1}

^{2}

^{*}

^{†}

Next Article in Journal

School of Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Korea

Department of Information and Communication Engineering, Chosun University, Gwangju 61452, Korea

Author to whom correspondence should be addressed.

These authors contributed equally to this work.

Academic Editor: Raúl Alcaraz Martínez

Received: 2 November 2016 / Revised: 1 December 2016 / Accepted: 2 December 2016 / Published: 8 December 2016

(This article belongs to the Section Information Theory, Probability and Statistics)

In this paper, we propose a new class of regular fractional repetition (FR) codes constructed from perfect difference families and quasi-perfect difference families to store big data in distributed storage systems. The main advantage of the proposed construction method is that it supports a wide range of code parameter values compared to existing ones, which is an important feature to be adopted in practical systems. When using one instance of the proposed codes for a given parameter set, we show that the amount of stored data is very close to that of an existing state-of-the-art optimal FR code.

As users produce a tremendous amount of data everyday and these big data need to be stored in cloud storage [1,2], the efficiency and reliability of distributed storage systems (DSSs) become more crucial. Rather than using a random protocol, we need to adopt some mathematical structures in storing big data to minimize the overall cost of DSSs. Recently, locally repairable codes (LRCs) [3,4,5,6,7], and regenerating codes (RCs) [8,9,10,11] have been proposed for DSSs to support cost-reducing repair against node failures as well as data reconstruction. For repair, LRCs aim to connect to a small number of other nodes while RCs aim to use a small amount of network bandwidth. There exists a trade-off between storage and repair bandwidth for general RCs and its two extreme points correspond to the minimum storage regenerating (MSR) codes and the minimum bandwidth regenerating (MBR) codes. If every repaired node contains the data which were stored before failure, such RC is said to be exact [12,13].

Fractional repetition (FR) codes are firstly defined in [14]. Data to be stored are fractionalized into several blocks, each block is repeated, and all repeated blocks are distributed to nodes based on a rule. FR codes can be seen as a variant of exact MBR codes and they aim to minimize disk I/O as well as network bandwidth for repair. Various construction methods for FR codes are proposed using graphs, design theory, algebraic structures, and so on [14,15,16,17,18,19]. Note that the existing construction methods result in FR codes with quite restricted code parameter values, which will be discussed in Section 2.

In this paper, we propose a new class of regular FR codes constructed from perfect difference families and quasi-perfect difference families. The proposed codes can support a wide range of code parameter values, especially, any integer-valued code length larger than a certain value. This property is valuable for practical DSSs where each data has an arbitrary size and different importance, so it may need to be repeated a different number of times or distributed along a different number of nodes. Also, we theoretically show that the proposed codes have a good property to store a large size of data. Finally, the proposed FR codes are compared with existing regular FR codes in terms of the constructible code parameters and the actual size of stored data for each number of connected nodes. This comparison shows that the amount of stored data based on one instance of proposed codes is not far from the FR capacity bound [14] and very close to that of an existing state-of-the-art optimal FR code under the same parameter values.

An ${(n,k,d,M,\alpha ,\beta )}_{q}$ RC for $k\le d\le n-1$ and $\beta \le \alpha $ is defined as follows. The number of nodes in DSS is denoted by n and the number of data symbols we need to store in DSS is denoted by M, which is often called file size. The alphabet of symbols is denoted by ${\mathbb{F}}_{q}$ and each node stores α symbols. The parameter k is called reconstruction degree which means that a data collector can reconstruct all data symbols by connecting to any k nodes, downloading α symbols from each node, and decoding them. A failed node is repaired by connecting to any other d node, downloading β symbols from each, and decoding them. With these notations, the repair bandwidth becomes $d\beta $. Without loss of generality, we assume $\beta =1$ for the code construction throughout this paper [12]. For MBR codes, d is equal to α and the file size is given as
for given k, $\alpha (=d)$, and $\beta (=1)$. This value is called MBR capacity.

$$M=k\alpha -\left(\genfrac{}{}{0pt}{}{k}{2}\right),$$

An $(n,\alpha ,\rho )$ regular FR code $\mathcal{C}$ is defined as a collection of n subsets ${N}_{0},{N}_{1},\dots ,{N}_{n-1}$ of $\{0,1,\dots ,\theta -1\}$ such that

- $n\alpha =\rho \theta $;
- $|{N}_{i}|=\alpha $ for $i=0,\dots ,n-1$;
- each symbol of $\{0,1,\dots ,\theta -1\}$ belongs to exactly ρ subsets in $\mathcal{C}$.

According to an FR code $\mathcal{C}$, node i, $i=0,\dots ,n-1$, stores the symbols in ${N}_{i}$. The parameter ρ is called the repetition degree of $\mathcal{C}$. The incidence matrix of $\mathcal{C}$, denoted by $I\left(\mathcal{C}\right)$, is defined by the $n\times \theta $ binary matrix whose $(i,j)$-th element is 1 if set ${N}_{i}$ includes symbol j or 0 otherwise. It is noted that the row and column weights of $I\left(\mathcal{C}\right)$ are α and ρ, respectively. An FR code is said to be irregular when $|{N}_{i}|$ is not constant or the number of appearances of each symbol in the subsets is not constant. An FR code is often used as an inner code together with an outer $(\theta ,M)$ MDS code. This concatenated code is called distributed replication based exact simple storage (DRESS) code [20] and is represented with parameters $\left[\right(\theta ,M),k,(n,\alpha ,\rho \left)\right]$.

FR codes can be seen as a variant of exact MBR codes. For repair, a failed node needs to connect to α nodes and download the $\beta =1$ symbol from each. This means $d=\alpha $. Note that repair can be done without any computation and the failed node gets to store exactly what it stored before failure. For reconstruction, a data collector needs to connect to any k node and download the α symbols from each. Up to now, FR codes have the same properties with MBR codes but the only difference is that the constraint of connecting to “any” d nodes for repair is relaxed to “some” d nodes and the selection of d nodes is usually table-based.

For a given FR code, the file size M is determined as a function of k as the following:
It is addressed in [14] that for a well-constructed FR code, $M\left(k\right)$ can be larger than the MBR capacity (1) and this becomes possible thanks to the relaxation of the constraint. For a given parameter set $(n,k,\alpha ,\rho )$, the FR capacity, denoted by $A(n,k,\alpha ,\rho )$, is defined as the maximum value of $M\left(k\right)$ among all possible FR codes. Capacity-achieving FR codes are constructed for some parameters in [19] but the FR capacity is unknown in general. Instead, an upper bound is derived in [14], shown as follows:

$$M\left(k\right)=\underset{\genfrac{}{}{0pt}{}{I\subset \{0,\dots ,n-1\}}{\left|I\right|=k}}{\mathrm{min}}\left|{\cup}_{i\in I}{N}_{i}\right|$$

$$\begin{array}{c}A(n,k,\alpha ,\rho )\le \varphi \left(k\right),\phantom{\rule{3.33333pt}{0ex}}\mathrm{where}\hfill \\ \varphi \left(1\right)=\alpha ,\phantom{\rule{3.33333pt}{0ex}}\varphi (k+1)=\varphi \left(k\right)+\alpha -\u2308\frac{\rho \varphi \left(k\right)-k\alpha}{n-k}\u2309\hfill \end{array}$$

In Section 3, we will propose new classes of regular FR codes. It is known that regular FR codes are good for load balancing [15] while irregular FR codes are suitable for heterogeneous networks [21]. Though various irregular FR codes have been proposed, for example, in [17,18,20,21], we focus on introducing existing regular FR codes related to our proposed construction. Also, we consider only the regular FR codes for $3\le \rho \le \alpha $ and $\beta =1$. It is noted that $\rho =2$ could not be enough for practical systems to have good reliability in general and $\rho \le \alpha $ is a common scenario in DSSs [15].

In [14], Steiner systems are used as a class of regular FR codes and a parameter set $(v,(v-1)/(k-1),k)$ is constructible when a Steiner system $S(2,k,v)$ exists. In [15], $(({q}^{n+2}-1)/(q-1),({q}^{n+1}-1)/(q-1),q+1)$ regular FR codes are constructed for a prime power q and $n\ge 1$, which are based on the projective geometry and Latin squares. In fact, these FR codes are a subclass of the FR codes from Steiner systems in [14] and they additionally have a scalable property. In [16], various regular FR codes are constructed. FR codes constructed from mutually orthogonal Latin squares have parameters $(\rho {p}^{m},{p}^{m},\rho \le {p}^{m}-1)$ and $(4a,a,4)$ for a prime p and positive integers m and an integer $a\ne 2,6$. For $\beta ={q}^{m-2}$, $(q\rho ,{q}^{m-1},1\le \rho \le ({q}^{m}-1)/(q-1))$ FR codes are constructed from affine resolvable designs and, for $\beta =a$, $(8a-2,2a,4a-1)$ FR codes are constructed from Hadamard designs, where q is a prime power and $4a-1\ge 7$ is an odd prime power. Lastly, in [19], $(\rho \alpha ,\alpha ,\rho )$ regular FR codes are constructed from transversal designs for $3\le \rho \le \alpha +1$. Also, $\left(\right(s+1\left)\right(st+1),t+1,s+1)$ regular FR codes are constructed from generalized quadrangles for $2\le s\le t$. These two classes of FR codes are optimal for selected parameters satisfying the conditions in [19]. In addition, a lower bound on the reconstruction degree is derived for all the other given parameters and some constructions which attain this bound are presented. Also, FR batch codes are constructed based on bipartite complete graphs, graphs with large girth, transversal designs, and affine planes to additionally satisfy the property that any set of symbols of a prescribed size can be retrieved by downloading, at most, one symbol from each node. It is noted that this property is not considered in this paper.

In this subsection, definitions and terms are given mostly based on [22]. A cyclic difference family is defined as follows.

Let ${\mathbb{Z}}_{v}$ denote the set of integers from 0 to $v-1$. Then tκ-element subsets of ${\mathbb{Z}}_{v}$, ${B}_{i}=\{{b}_{i0},{b}_{i1},\dots ,{b}_{i(\kappa -1)}\}$, $i=0,1,\dots ,t-1$, form a $(v,\kappa ,\lambda )$ cyclic difference family (CDF) if every nonzero element in ${\mathbb{Z}}_{v}$ occurs λ times among the differences $({b}_{il}-{b}_{im})\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}v$, $0\le i\le t-1$ and $0\le l\ne m\le \kappa -1$.

In this paper, special classes of CDFs, called perfect difference family and quasi-perfect difference family, will be used to construct FR codes.

Consider tκ-element subsets of ${\mathbb{Z}}_{v}$ ${B}_{i}=\{{b}_{i0},{b}_{i1},\dots ,{b}_{i(\kappa -1)}\}$, $i=0,1,\dots ,t-1$, ${b}_{i0}<{b}_{i1}<\cdots <{b}_{i(\kappa -1)}$. Among the $t\kappa (\kappa -1)$ differences $({b}_{il}-{b}_{im})\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}v$, $0\le i\le t-1$ and $0\le l\ne m\le \kappa -1$, the half of them ${b}_{il}-{b}_{im}$, $i=0,1,\dots ,t-1$, $0\le m<l\le \kappa -1$, are called the positive differences of ${B}_{i}$’s over ${\mathbb{Z}}_{v}$ and the other half of differences are called the negative differences of ${B}_{i}$’s over ${\mathbb{Z}}_{v}$.

Consider a $(v,\kappa ,1)$ CDF, ${B}_{i}=\{{b}_{i0},{b}_{i1},\dots ,$ ${b}_{i(\kappa -1)}\}$, $i=0,1,\dots ,t-1$, ${b}_{i0}<{b}_{i1}<\cdots <{b}_{i(\kappa -1)}$, with $v=\kappa (\kappa -1)t+1$. Then the CDF is called a $(v,\kappa ,1)$ perfect difference family (PDF) if the $t\kappa (\kappa -1)/2$ positive differences exactly cover the set $\{1,2,\dots ,(v-1)/2\}$ and called a $(v,\kappa ,1)$ quasi-perfect difference family (QPDF) if they exactly cover the set $\{1,2,\dots ,(v-3)/2\}\cup \left\{\right(v+1)/2\}$.

Some recent results on the existence of PDFs and QPDFs are summarized in the following theorem:

The existence of $\left(\kappa \right(\kappa -1)t+1,\kappa ,1)$ (Q)PDFs is given as:

- (1)
- A $(6t+1,3,1)$ PDF exists for $t\equiv 0\phantom{\rule{3.33333pt}{0ex}}\mathrm{or}\phantom{\rule{3.33333pt}{0ex}}1\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}4$.
- (2)
- A $(6t+1,3,1)$ QPDF exists for $t\equiv 2\phantom{\rule{3.33333pt}{0ex}}\mathrm{or}\phantom{\rule{3.33333pt}{0ex}}3\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}4$.
- (3)
- A $(12t+1,4,1)$ PDF exists for $t=1$ or $4\le t\le 1000$.
- (4)
- $(20t+1,5,1)$ PDFs are known for $t=6,8,10$ but for no other values in $1\le t\le 50$.
- (5)
- There is no $\left(\kappa \right(\kappa -1)t+1,\kappa ,1)$ PDF for $\kappa \ge 6$.

Let ${b}_{i,\mathrm{min}}={\mathrm{min}}_{\kappa}{b}_{i\kappa}$ for ${b}_{i\kappa}\in {B}_{i}$, $i=0,1,\dots ,t-1$, in a CDF. Modifying the CDF to ${B}_{i}^{\prime}=\{{b}_{i0}-{b}_{i,\mathrm{min}},{b}_{i1}-{b}_{i,\mathrm{min}},\dots ,{b}_{i(\kappa -1)}-{b}_{i,\mathrm{min}}\}$ is called normalization and the normalization does not change the property of the CDF. Using ordering together with the normalization, any CDF can be represented as ${B}_{i}=\{{b}_{i0}=0,{b}_{i1},\dots ,{b}_{i(\kappa -1)}\}$, $0<{b}_{i1}<\cdots <{b}_{i(\kappa -1)}$, $i=0,1,\dots ,t-1$, which will be assumed as the default throughout this paper.

For $\kappa =3$ and $1\le t\le 5$, we provide specific examples for $(6t+1,3,1)$ (Q)PDFs as below and for the other cases, explicit construction methods are found in [22].

- $t=1$; $\{0,1,3\}$
- $t=2$; $\{0,1,4\},\{0,2,7\}$
- $t=3$; $\{0,1,6\},\{0,2,10\},\{0,3,7\}$
- $t=4$; $\{0,1,6\},\{0,2,11\},\{0,3,10\},\{0,4,12\}$
- $t=5$; $\{0,1,14\},\{0,2,8\},\{0,3,12\},\{0,4,11\},\{0,5,15\}$

We propose a construction method of $(n,\alpha ,\rho )$ FR codes based on (Q)PDFs as below:

- 1.
- Choose n, α, and ρ so that they satisfy
- (i)
- $3\le \rho \le 5$,
- (ii)
- $\alpha =t\rho $, where t is an integer,
- (iii)
- ${t}^{\prime}\ge t$ is an integer which satisfies the condition for the existence of $(\rho (\rho -1){t}^{\prime}+1,\rho ,1)$ (Q)PDF shown in Theorem 1,
- (iv)
- $n\ge \rho (\rho -1){t}^{\prime}+1$ if $\left(\rho \right(\rho -1)t+1,\rho ,1)$ PDF exists, or $n\ge \rho (\rho -1)t+1,\phantom{\rule{3.33333pt}{0ex}}n\ne \rho (\rho -1)t+2$ if $\left(\rho \right(\rho -1)t+1,\rho ,1)$ QPDF exists.

- 2.
- Construct a $(\rho (\rho -1){t}^{\prime}+1,\rho ,1)$ (Q)PDF ${B}_{i}^{\prime}=\{0,{b}_{i1}^{\prime},\dots ,{b}_{i(\rho -1)}^{\prime}\}$, $i=0,1,\dots ,{t}^{\prime}-1$.
- 3.
- Select t subsets ${B}_{i}=\{0,{b}_{i1},\dots ,{b}_{i(\rho -1)}\}$, $i=0,1,\dots ,$ $t-1$ from the (Q)PDF of ${t}^{\prime}$ subsets.
- 4.
- Construct an $(n,\alpha ,\rho )$ FR code such that the set of locations of 1’s in the j-th row of the incidence matrix is $\{({b}_{il}+j)\phantom{\rule{4.44443pt}{0ex}}\left(\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n\right)+ni\}$, $i=0,1,\dots ,$ $t-1$, $j=0,1,\dots ,n-1$, $l=0,1,\dots ,\rho -1$.

The incidence matrix has the size of $n\times tn$ and, actually, each ${B}_{i}$ forms an $n\times n$ circulant matrix and the incidence matrix is a concatenation of t circulant matrices. A choice of ρ, t, ${t}^{\prime}$, and n fully determines the resulting FR code. When $t={t}^{\prime}$, all subsets of $(\rho (\rho -1){t}^{\prime}+1,\rho ,1)$ (Q)PDF are used to construct the proposed FR codes but we can still flexibly select n as long as it is not less than $\rho (\rho -1)t+1$. It is noted that for the choice of $t={t}^{\prime}$ and $n=\rho (\rho -1){t}^{\prime}+1$, the incidence matrix becomes that of the Steiner system $S(2,\rho ,\rho (\rho -1){t}^{\prime}+1)$ but for every $n>\rho (\rho -1){t}^{\prime}+1$, the incidence matrix is different from that of the Steiner system and it still guarantees that any two nodes do not contain a pair of symbols in common.

When $t<{t}^{\prime}$, only t out of ${t}^{\prime}$ subsets in $(\rho (\rho -1){t}^{\prime}+1,\rho ,1)$ (Q)PDF are used to construct the proposed FR code. This is equivalently explained as the resulting incidence matrix is constructed by removing some circulant matrices from the incidence matrix in the case of $t={t}^{\prime}$. For example, assume that we choose $\rho =3$, $t=2$, ${t}^{\prime}=3$, and $n=19$. Then, we have the $(19,3,1)$ QPDF ${B}_{0}^{\prime}=\{0,1,6\}$, ${B}_{1}^{\prime}=\{0,2,10\}$, ${B}_{2}^{\prime}=\{0,3,7\}$ given in Section 2.3. If we choose the first and the last subsets, ${B}_{0}=\{0,1,6\}$ and ${B}_{1}=\{0,3,7\}$ are used to construct the proposed $(19,6,3)$ FR code whose incidence matrix is shown in Figure 1.

With the proposed construction, we can achieve a wide range of code parameters, especially in terms of n, and this is a main contribution of this work and differentiates from the existing constructions of FR codes, which will be shown in Section 3.3. It is noted that the proposed FR codes are scalable for the number of stored data symbols because we can easily add or remove some symbols corresponding to a subset. However, they are not good for the scalability of the number of storage nodes because adding or removing a node does not sustain the structure of circulant matrices any more.

Equation (2) implies that any k nodes need to contain as few as possible common symbols to have a large file size M for a given n, α, and ρ. Thus, allowing any two nodes to contain, at most, one common symbol can be a good strategy to make an $(n,\alpha ,\rho )$ FR code achieve a large M and the next theorem shows that the proposed FR codes satisfy this property.

Any two nodes contain, at most, one common symbol when using the proposed FR codes.

Since an incidence matrix of FR codes with $t<{t}^{\prime}$ can be regarded as being constructed by removing some circulant matrices from the one with $t={t}^{\prime}$, we need to prove this property for only $t={t}^{\prime}$. For a set S, we first define ${D}_{n}\left(S\right)$ as the multiset of cardinality $\left|S\right|\left(\right|S|-1)$ such that $(s-{s}^{\prime})\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n\in {D}_{n}\left(S\right)$ for all $s,{s}^{\prime}\in S$, $s\ne {s}^{\prime}$.

Assume that we have a (Q)PDF ${B}_{i}=\{{b}_{i0}=0,{b}_{i1},\dots ,$ ${b}_{i(\rho -1)}\}$, $i=0,1,\dots ,t-1$, and the corresponding $n\times tn$ incidence matrix. For nodes l and m, $0\le l<m\le n-1$, node l has symbols $({b}_{ij}+l)\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n+ni$ and node m has symbols $({b}_{i{j}^{\prime}}+m)\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n+ni$ for $j,{j}^{\prime}=0,\dots ,\rho -1$. Nodes l and m have a common symbol in the i-th circulant matrix if $({b}_{ij}+l)\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n=({b}_{i{j}^{\prime}}+m)\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n$ for some j and ${j}^{\prime}$. This condition is equivalent to $({b}_{ij}-{b}_{i{j}^{\prime}})\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}n=m-l$, which is simply $m-l\in {D}_{n}\left({B}_{i}\right)$. Thus, we need to check if $m-l$ appears, at most, once in ${D}_{n}\left({B}_{i}\right)$, $i=0,\dots ,t-1$, to prove this theorem.

According to the definition of (Q)PDF, the set of positive differences of ${B}_{i}$’s over ${\mathbb{Z}}_{n}$ is $\{1,2,\dots ,\rho (\rho -1)t/2\}$ ($\{1,2,\dots ,\rho (\rho -1)t/2-1,\rho (\rho -1)t/2+1\}$) and it does not depend on the value of n. The set of negative differences of ${B}_{i}$’s over ${\mathbb{Z}}_{n}$ becomes $\{n-\rho (\rho -1)t/2,\dots ,n-1\}$ ($\{n-\rho (\rho -1)t/2-1,n-\rho (\rho -1)t/2+1,\dots ,n-1\}$) and we can see that there is no overlap between the positive and negative differences by the choice of parameters in Step 1 of Construction. Thus, any $m-l$ ($1\le m-l\le n-1$) appears, at most, once among ${D}_{n}\left({B}_{0}\right)\cup {D}_{n}\left({B}_{1}\right)\cup \cdots \cup {D}_{n}\left({B}_{t-1}\right)$. ☐

First, in Figure 2, we list all constructible $5\le n\le 50$ and $3\le \alpha \le 16$ of the proposed FR codes and the existing regular FR codes with $\beta =1$ for $\rho =3$ which is the most common repetition degree in DSSs. In the legend, PROP, SS, MOLS, TD, and GQ represent the proposed FR codes, the FR codes constructed from Steiner systems, mutually orthogonal Latin squares, transversal designs, and generalized quadrangles, respectively. Our construction gives abundant choices of n for α equal to a multiple of 3 while other constructions give, at most, one value of n for each α.

To compare the file size of the proposed and existing FR codes, we need to select proper parameter values for which some of these codes can be constructed in common. In Figure 2, we can see that every point of PROP with $n=2\alpha +1$ always overlaps a point of SS; every point of MOLS with a prime power α also overlaps a point of TD. Actually, they are equivalent to each other for the parameter values, which is well-known in design theory [22]. Except for too small or too large parameter values, the parameters $(n,\alpha ,\rho )=(33,6,3),(27,9,3)$ seem proper for comparison but the generalized quadrangle for $(33,6,3)$ is not known [22]. Thus, for $(27,9,3)$, we compare, via numerical analysis, the file size of PROP, TD, and the upper bound of FR capacity in (3).

In Figure 3, PROP1 and PROP2 represent the proposed $(27,9,3)$ FR codes with the choice of $t={t}^{\prime}=3$ and the choice of $t=3$ and ${t}^{\prime}=4$, respectively, and the FR bound represents the upper bound in (3). We can see that PROP1 has a slightly smaller file size than the others but PROP2 has almost the same file size with TD. It is noted that TD is known as an optimal code for some parameters but in this case it may not be optimal because it follows the FR bound only up to $k=5<\alpha $.

In this paper, we propose a construction method of regular FR codes based on (Q)PDFs. The resulting FR codes can have various code parameter values compared to other existing regular FR codes, which is an important feature for practical system design. One instance of them, for some parameters, is shown to have a file size close to the FR codes constructed from transversal designs via numerical analysis. Consequently, we can say that the proposed construction enables a flexible choice of code parameter values by slightly sacrificing the file size compared to the state-of-the-art FR codes.

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2014R1A2A2A01006870) and this research was also supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2015R1D1A1A01060941).

Hosung Park and Young-Sik Kim devised the proposed construction; Hosung Park performed the experiments; Young-Sik Kim verified the data; Hosung Park and Young-Sik Kim analyzed the result; Hosung Park wrote the paper. All the authors have read and approved the final manuscript.

The authors declare no conflict of interest.

- Kim, S.-H.; Lee, I.-Y. Block access token renewal scheme based on secret sharing in Apache Hadoop. Entropy
**2014**, 16, 4185–4198. [Google Scholar] [CrossRef] - Tamura, Y.; Yamada, S. Reliability analysis based on a jump diffusion model with two Wiener processes for cloud computing with big data. Entropy
**2015**, 17, 4533–4546. [Google Scholar] [CrossRef] - Gopalan, P.; Huang, C.; Simitci, H.; Yekhanin, S. On the locality of codeword symbols. IEEE Trans. Inf. Theory
**2012**, 58, 6925–6934. [Google Scholar] [CrossRef] - Papailiopoulos, D.S.; Dimakis, A.G. Locally repairable codes. IEEE Trans. Inf. Theory
**2014**, 60, 5843–5855. [Google Scholar] [CrossRef] - Song, W.; Dau, S.H.; Yuen, C.; Li, J. Optimal locally repairable linear codes. IEEE J. Sel. Areas Commun.
**2014**, 32, 1019–1036. [Google Scholar] [CrossRef] - Song, W.; Yuen, C. Binary locally repairable codes—Sequential repair for multiple erasures. In Proceedings of the 2016 IEEE Global Communications Conference, Washington, DC, USA, 4–8 December 2016.
- Dau, S.H.; Kiah, H.M.; Song, W.; Yuen, C. Locally encodable and decodable codes for distributed storage systems. In Proceedings of the 2015 IEEE Global Communications Conference, San Diego, CA, USA, 6–10 December 2015.
- Dimakis, A.G.; Godfrey, P.B.; Wainwright, M.J.; Ramchandran, K. Network coding for distributed storage systems. In Proceedings of the 2007 IEEE International Conference on Computer Communications, Anchorage, AK, USA, 6–12 May 2007; pp. 2000–2008.
- Wu, Y.; Dimakis, A.G.; Ramchandran, K. Deterministic regenerating codes for distributed storage. In Proceedings of the 2007 Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 28–30 September 2007.
- Van, V.T.; Yuen, C.; Li, J. Non-homogeneous distributed storage systems. In Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 1–5 October 2012; pp. 1133–1140.
- Pernas, J.; Yuen, C.; Gastón, B.; Pujol, J. Non-homogeneous two-rack model for distributed storage systems. In Proceedings of the 2013 IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 1237–1241.
- Rashmi, K.V.; Shah, N.B.; Kumar, P.V. Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction. IEEE Trans. Inf. Theory
**2011**, 57, 5227–5239. [Google Scholar] [CrossRef] - Tian, C. Characterizing the rate region of the (4, 3, 3) exact-repair regenerating codes. IEEE J. Sel. Areas Commun.
**2014**, 32, 967–975. [Google Scholar] [CrossRef] - Rouayheb, S.E.; Ramchandran, K. Fractional repetition codes for repair in distributed storage systems. In Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, Urbana-Champaign, IL, USA, 29 September–1 October 2010; pp. 1510–1517.
- Koo, J.C.; Gill, J.T., III. Scalable constructions of fractional repetition codes in distributed storage systems. In Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, Urbana-Champaign, IL, USA, 28–30 September 2011; pp. 1366–1373.
- Olmez, O.; Ramamoorthy, A. Fractional repetition codes with flexible repair from combinatorial designs. IEEE Trans. Inf. Theory
**2016**, 62, 1565–1591. [Google Scholar] [CrossRef] - Zhu, B.; Shum, K.W.; Li, H.; Hou, H. General fractional repetition codes for distributed storage systems. IEEE Commun. Lett.
**2014**, 18, 660–663. [Google Scholar] [CrossRef] - Zhu, B.; Shum, K.W.; Li, H. Heterogeneity-aware codes with uncoded repair for distributed storage systems. IEEE Commun. Lett.
**2015**, 19, 901–904. [Google Scholar] [CrossRef] - Silberstein, N.; Etzion, T. Optimal fractional repetition codes based on graphs and designs. IEEE Trans. Inf. Theory
**2015**, 61, 4164–4180. [Google Scholar] [CrossRef] - Pawar, S.; Noorshams, N.; Rouayheb, S.E.; Ramchandran, K. DRESS codes for the storage cloud: Simple randomized constructions. In Proceedings of the 2011 IEEE International Symposium on Information Theory, St. Petersburg, Russia, 31 July–5 August 2011; pp. 2338–2342.
- Yu, Q.; Sung, C.W.; Chan, T.H. Irregular fractional repetition code optimization for heterogeneous cloud storage. IEEE J. Sel. Areas Commun.
**2014**, 32, 1048–1060. [Google Scholar] [CrossRef] - Colbourn, C.J.; Dinitz, J.H. Handbook of Combinatorial Designs, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2007. [Google Scholar]

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).