Riak Cloud Storage. Part 2. Configuring the Riak CS component

In this article, we will continue to configure the individual components of the Riak Cloud Storage system, namely the Riak CS component.





This article continues a series of free translations of the official manual for Riak CS 2.1.1.

Part 1. Configuring Riak KV

Part 3. Stanchion, Proxy and Load Balancing, S3 Client

For Riak CS to operate correctly, it must know how to connect to Riak KV. A Riak CS node usually runs on the same server as its corresponding Riak KV node, so changes are only necessary if Riak KV is configured with non-default settings.



Riak CS settings are stored on the CS node in the riak-cs.conf and advanced.config configuration files, both usually located in the /etc/riak-cs directory. The newer riak-cs.conf file is a simple list of option = value pairs, but some options can only be changed through the advanced.config file, which looks something like this:



ADVANCED.CONFIG



{riak_cs, [
    {parameter1, value},
    {parameter2, value},
    %% and so on...
]},


If you are upgrading from a version earlier than 2.0.0, the release in which the riak-cs.conf file was introduced, you can still use the app.config file, which lives in the same directory as riak-cs.conf and advanced.config. The app.config file has the same syntax as advanced.config, so any advanced.config example can be used directly in app.config.

Please note that the old app.config file takes precedence over the new configuration files. If app.config is present, neither riak-cs.conf nor advanced.config will be used.
Note: about legacy app.config

If you are upgrading from an earlier version of Riak CS to Riak CS 2.0 and plan to keep using the legacy app.config file, note that some configuration option names have changed. The IP/port format also changed in version 2.0 for Stanchion, Riak and Riak CS. The changes are described in the Rolling Upgrades document.



For an exhaustive list of available options and a complete list of options for app.config, see the Full Configuration Reference .
The sections below describe important configuration options for the Riak CS.



Host and port



To connect Riak CS to Riak KV, make sure the following parameter is set to the host and port used by Riak KV:



  • riak_host - replace 127.0.0.1:8087 with the IP address and port number of the Riak KV node to which you want to connect the Riak CS.


You also need to set the listener host for Riak CS:



  • listener - replace 127.0.0.1:8080 with the IP address and port number of the Riak CS host if you intend to use it non-locally. Make sure the port number does not conflict with the riak_host port when the Riak KV node and the Riak CS node run on the same machine.
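
As a minimal sketch, the two settings above might look like this in riak-cs.conf when Riak KV and Riak CS share one server. The address 192.0.2.10 is a placeholder chosen for illustration, not a value from the manual; the ports are the defaults named above.

```conf
## riak-cs.conf (sketch; 192.0.2.10 is a hypothetical address)
## Riak KV protocol buffers endpoint that this Riak CS node connects to
riak_host = 192.0.2.10:8087
## Address and port on which Riak CS listens for client requests
listener = 192.0.2.10:8080
```

The two ports differ, so the settings do not collide on a shared machine.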


Note: about the IP address

The IP address you enter here must match the IP address specified for the Riak KV protocol buffers interface in the riak.conf file, unless Riak CS is running on a completely different network, in which case address translation is required.
After making any changes to riak-cs.conf, restart the Riak CS node if it is already running.



Stanchion Node parameters



If you are running a single Riak CS node, you do not need to change the Stanchion settings, because Stanchion runs on the local host. (Note: only one Stanchion instance is installed for the entire cluster.) If your Riak CS system has multiple nodes, you must specify the IP address and port of the Stanchion node and whether SSL is used.

The Stanchion parameters are set in the riak-cs.conf configuration file of each Riak CS node, located in the ./etc/riak-cs/conf directory.



To set the host and port for Stanchion, make sure the following parameter is set to the host and port that Stanchion uses:



  • stanchion_host - Replace 127.0.0.1:8085 with the IP address and port of the Stanchion host.
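
For a multi-node setup, the setting might look like the sketch below on every Riak CS node. The address 192.0.2.20 is a hypothetical placeholder for the server running Stanchion; 8085 is the default port mentioned above.

```conf
## riak-cs.conf on each Riak CS node (sketch; 192.0.2.20 is hypothetical)
## All Riak CS nodes point at the single Stanchion instance
stanchion_host = 192.0.2.20:8085
```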


Using SSL



SSL is disabled by default in Stanchion, that is, the stanchion_ssl parameter is set to off. If Stanchion is configured to use SSL, change the value to on. The following configuration example sets the Stanchion host to localhost and the port to 8085 (the default) and enables SSL.



RIAK-CS.CONF



stanchion_host = 127.0.0.1:8085
stanchion_ssl = on


ADVANCED.CONFIG

{riak_cs, [
    %% Other configs
    {stanchion_host, {"127.0.0.1", 8085}},
    {stanchion_ssl, true},
    %% Other configs
]}


Setting the hostname



You can also give the Riak CS node a more convenient name, which helps you identify the node from which requests originate during troubleshooting. The name is set in the riak-cs.conf configuration file or in the vm.args file, which is located in the same configuration directory. In the examples below the Riak CS node name is set to riak_cs@127.0.0.1:



RIAK-CS.CONF



nodename = riak_cs@127.0.0.1


VM.ARGS



-name riak_cs@127.0.0.1


Change 127.0.0.1 to the IP address or hostname of the server running Riak CS.



Create an administrator account



An administrator is a user authorized to perform actions such as creating users or retrieving statistics. An administrator account is otherwise no different from any other user account. You must create an administrator account before you can use Riak CS.

Note: about creating an anonymous user.



Before creating an administrator account, you must set anonymous_user_creation = on in riak-cs.conf (or {anonymous_user_creation, true} in advanced.config / app.config). You can disable it again once the administrator has been created.
To create an administrator account, send an HTTP POST request with the username you want for the administrator account. For example:



CURL



curl -H 'Content-Type: application/json' \
  -XPOST http://<host>:<port>/riak-cs/user \
  --data '{"email":"admin@example.com", "name":"admin"}'


The JSON response should look like this:



{
  "display_name" : "admin",
  "email" : "admin@example.com",
  "id" : "8d6f05190095117120d4449484f5d87691aa03801cc4914411ab432e6ee0fd6b",
  "key_id" : "OUCXMB6I3HOZ6D0GWO2D",
  "key_secret" : "a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag==",
  "name" : "admin_example",
  "status" : "enabled"
}


You can optionally send and receive XML by setting the Content-Type to application/xml.



Once the administrator has been created, you must set the administrator credentials on each Riak CS node. They are set in the riak-cs.conf configuration file located in the /etc/riak-cs directory. Paste the key_id value from the response into admin.key and the key_secret value into admin.secret:



RIAK-CS.CONF



admin.key = OUCXMB6I3HOZ6D0GWO2D
admin.secret = a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag==


ADVANCED.CONFIG



{riak_cs, [
           %% Admin user credentials
           {admin_key, "OUCXMB6I3HOZ6D0GWO2D"},
           {admin_secret, "a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag=="},
           %% Other configs
          ]}


Limiting buckets



You can also limit the number of buckets each user can create. The default maximum is 100 buckets. Keep in mind that a user who has reached the bucket creation limit can still perform other operations, including deleting buckets. You can change the default limit with the max_buckets_per_user parameter in the advanced.config file on each node; there is no equivalent setting in riak-cs.conf. For example, the configuration below sets a maximum of 1000:



ADVANCED.CONFIG



{riak_cs, [
           %% Other configs
           {max_buckets_per_user, 1000},
           %% Other configs
          ]}


If you want to remove the restrictions on creating buckets by one user, you can set the value of the max_buckets_per_user parameter to unlimited.



Connection pools



Riak CS uses two separate connection pools to communicate with Riak KV: a primary pool and a secondary pool.



The primary connection pool services most API requests related to uploading or retrieving objects. It is specified in the configuration file as pool.request.size. The default pool size is 128.



The secondary connection pool is used strictly for requests that list the contents of buckets. A separate connection pool is maintained for these requests to improve performance. It is defined in the configuration file as pool.list.size. By default, its size is 5.



Below is the default connection pool configuration. In advanced.config or app.config it appears as the connection_pools entry:



RIAK-CS.CONF



pool.request.size = 128
pool.request.overflow = 0
pool.list.size = 5
pool.list.overflow = 0


ADVANCED.CONFIG



{riak_cs, [
           %% Other configs
           {connection_pools,
           [
            {request_pool, {128, 0} },
            {bucket_list_pool, {5, 0} }
           ]},
           %% Other configs
]}


Each pool is specified as a pair of values. The first number is the normal size of the pool, that is, the number of concurrent requests of that type that the Riak CS node can service. The second number is the allowed pool overflow. Using any overflow value other than 0 is not recommended unless careful testing has shown it to be beneficial for your particular use case.



Tuning



We strongly recommend that you take care when setting the value of the pb_backlog parameter in Riak KV. When a Riak CS node starts, each connection pool begins establishing connections to Riak KV. This can lead to a thundering herd problem, in which connections in the pool appear to be connected to Riak KV although some of them have in fact been reset. Because of TCP RST packet rate limiting (controlled by the net.inet.icmp.icmplim parameter), some connections may not be notified of the reset until they are used to service a user's request. This manifests itself as {error, disconnected} messages in the Riak CS log files and an error returned to the user.



SSL Connection in Riak-CS



RIAK-CS.CONF



ssl.certfile = "./etc/cert.pem"
ssl.keyfile = "./etc/key.pem"


ADVANCED.CONFIG

{ssl, [
    {certfile, "./etc/cert.pem"},
    {keyfile, "./etc/key.pem"}
   ]},


Replace the text in quotes with the paths to your SSL certificate and key files. By default, the cert.pem and key.pem files are located in the /etc directory on each node. You are free to use these keys or your own.



Please note that you must also provide a certificate authority (CA) certificate. To do so, use the advanced.config configuration file and specify the certificate's location in the cacertfile parameter. Unlike certfile and keyfile, the cacertfile parameter does not appear in the default file as a commented-out entry, so you must add it yourself. An example of such a configuration:



ADVANCED.CONFIG



{ssl, [
       {certfile, "./etc/cert.pem"},
       {keyfile, "./etc/key.pem"},
       {cacertfile, "./etc/cacert.pem"}
      ]},
      %% Other configs


You can find instructions for creating a CA certificate on third-party resources .



Proxy vs. Direct Connection



Riak CS can interact with S3 clients in one of two ways:



  1. proxy configuration - the S3 client connects to Riak CS as if it were Amazon S3, that is, using typical Amazon URLs.
  2. direct connection - requires an S3 client that can be configured for an “S3-compatible service”, that is, pointed at a Riak CS endpoint that does not masquerade as Amazon S3. Examples of such clients are Transmit, s3cmd and DragonDisk.


Proxy



To set up a proxy configuration, configure your client's proxy settings to point to the address of the Riak CS cluster. Then configure the client with your Riak CS credentials.



When Riak CS receives a proxied request, it services the request itself and responds to the client as if the request had gone to S3.



On the server side, the root_host parameter in the riak-cs.conf file must be s3.amazonaws.com because all client requests for bucket URLs will target s3.amazonaws.com. This is the default.

Important: One problem with proxy configurations is that many GUI clients allow only one proxy to be configured for all connections. This can be problematic for users who need to connect to both S3 and Riak CS.

Direct connection



A direct connection is configured through the cs_root_host parameter in the riak_cs section of the app.config file. The value must be set to the FQDN of your Riak CS endpoint, because all bucket URLs will target that FQDN.



You will also need a wildcard DNS record so that any child of the endpoint resolves to the endpoint itself. Example:



CONFIG



data.riakcs.net
*.data.riakcs.net
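
As a rough sketch of the server side of a direct connection, using the example domain above: in riak-cs.conf the option is named root_host (as in the proxy section), which is assumed here to correspond to cs_root_host in app.config.

```conf
## riak-cs.conf (sketch): direct connection
## data.riakcs.net is the example endpoint from above; root_host is
## assumed to be the riak-cs.conf counterpart of cs_root_host
root_host = data.riakcs.net
```

With this setting, a bucket named photos (a hypothetical name) would be addressed as photos.data.riakcs.net, which is why the wildcard DNS record is needed.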


Garbage collector settings





The following settings are available to configure the garbage collector in Riak CS. For more information, see the Garbage Collection section .



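  • gc.leeway_period (leeway_seconds in advanced.config or app.config) - the amount of time that must elapse before an object version that has been explicitly deleted or overwritten becomes eligible for garbage collection. The default is 24h (24 hours).
  • gc.interval (gc_interval in advanced.config or app.config) - the interval at which the garbage collection daemon runs to search for and reap eligible object versions. The default is 15m (15 minutes). It is important that only one garbage collection daemon runs in a cluster at any time. To disable the daemon on a node, set gc_interval to infinity.
  • gc.retry_interval (gc_retry_interval in advanced.config or app.config) - the amount of time that must elapse before another attempt is made to write a record for an object manifest in the pending_delete state to the garbage collection eligibility bucket. This situation should be rare, but it can occur if an error causes the original record in the eligibility bucket to be removed before reaping completes. The default is 6h (6 hours).
  • gc.max_workers (gc_max_workers in advanced.config or app.config) - the maximum number of worker processes that the garbage collection daemon may start to reap eligible objects concurrently. The default is 2.
  • active_delete_threshold (active_delete_threshold in advanced.config or app.config) - blocks of objects smaller than the threshold are deleted synchronously and their manifests are marked as scheduled_delete. The default is 0.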
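
For orientation, the riak-cs.conf settings listed above can be written out with their default values. This is only a sketch that restates the defaults; the comment line is an explanatory addition, not part of the manual.

```conf
## riak-cs.conf (sketch): garbage collection settings at their defaults
gc.leeway_period = 24h
gc.interval = 15m
gc.retry_interval = 6h
gc.max_workers = 2
active_delete_threshold = 0
```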


There are some additional settings that can only be set in the advanced.config or app.config configuration files. None of the following settings are available through the riak-cs.conf file.



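  • epoch_start - the time from which the garbage collection daemon begins collecting keys from the garbage collection eligibility bucket. Records in this bucket use keys based on the epoch time at which the record was created + leeway_seconds. The default is 0, which should be sufficient for general use. One reason to adjust it is when the secondary index query run by the daemon repeatedly times out; raising the starting value narrows the range of the query and makes it more likely to succeed. The value must be given as an Erlang binary. For example, to set it to 10, specify <<"10">>.
  • initial_gc_delay - the number of seconds to wait, in addition to the gc_interval value, before the garbage collection daemon runs for the first time after the Riak CS node starts. Note: this option was originally intended to stagger GC runs across multiple nodes; running more than one GC daemon in a cluster is strongly discouraged. Accordingly, setting initial_gc_delay is not recommended.
  • max_scheduled_delete_manifests - the maximum number of manifests (each representing an object version) that can be in the scheduled_delete state for a given key. A value of unlimited means there is no maximum, and no pruning is done based on count. This option is useful, for example, when a fixed set of keys is rewritten very frequently within a period that is short compared with leeway_seconds. In that case the manifest objects can grow to a size that hurts system performance.
  • gc_batch_size - the page size used to paginate the results of secondary index queries. The default is 1000.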


Deprecated configuration

Riak CS 2.0 still supports the gc_paginated_indexes setting, but using it is strongly discouraged. The setting will be removed in the next major release.

Other Riak CS settings



For a complete list of configurable Riak CS parameters, you can refer to the configuration reference document .



Links



Riak Cloud Storage. Part 1. Configuring Riak KV

Riak Cloud Storage. Part 2. Configuring the Riak CS component

Riak Cloud Storage. Part 3. Stanchion, Proxy and Load Balancing, S3 Client



Original manual.


