Riak Cloud Storage. Part 3. Stanchion, Proxy and Load Balancing, S3 Client



In this article, we will complete the configuration of the components of the Riak Cloud Storage system.



This article concludes a series of loose translations of the official manual for Riak CS 2.1.1.

Part 1. Setting up Riak KV

Part 2. Setting up the Riak CS component



Setting up Stanchion



You should use one and only one Stanchion node in your cluster. All Riak CS nodes in this cluster must be configured to communicate with the Stanchion node so that the cluster can track and reconcile causal operations.



All settings used by the Stanchion node are contained in the stanchion.conf file, which is located in the /etc/stanchion directory on most operating systems.



If you are upgrading from a version earlier than Riak CS 2.0.0, the release that introduced stanchion.conf and riak-cs.conf, you can still use the old app.config configuration file. Each example below is shown in both formats, and the two forms are equivalent.



STANCHION.CONF



configuration.name = value


APP.CONFIG



{stanchion, [
             %% Configs here
            ]}


Configuring IP address and port for Stanchion



If you only have one Riak CS node, you do not need to change the Stanchion settings, because by default Stanchion only listens for requests from localhost. If the Riak CS cluster has several nodes, you must set the IP address and port on which Stanchion listens for requests from the other nodes.



You can set the IP address and port using the listener parameter. Replace 127.0.0.1 with the IP address of the Stanchion node and 8085 with the port that the node listens on:



STANCHION.CONF



listener = 127.0.0.1:8085


APP.CONFIG



{stanchion, [
             {host, {"127.0.0.1", 8085}},
             %% Other configs
            ]}


Note on matching IP addresses



The IP address you enter here must match the IP address in the stanchion_host parameter in riak.conf for Riak and in riak-cs.conf for Riak CS.
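
A mismatch between these two values is easy to overlook, so it can help to compare them mechanically. The sketch below is only an illustration: it assumes both files use plain `key = value` lines, and the helper names and the address 10.0.2.10 are invented for the example.

```python
def parse_conf(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def stanchion_address_matches(stanchion_conf, riak_cs_conf):
    """True if Stanchion's listener equals stanchion_host in riak-cs.conf."""
    listener = parse_conf(stanchion_conf).get("listener")
    target = parse_conf(riak_cs_conf).get("stanchion_host")
    return listener is not None and listener == target


# Sample values for illustration only (10.0.2.10 is a made-up address).
stanchion_conf = "listener = 10.0.2.10:8085\n"
riak_cs_conf = "## other settings\nstanchion_host = 10.0.2.10:8085\n"
print(stanchion_address_matches(stanchion_conf, riak_cs_conf))
```

In practice you would read the text of the real files from each node instead of using inline strings.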


If you want to use SSL, make sure the ssl.certfile and ssl.keyfile parameters are uncommented and point to the correct files.



STANCHION.CONF



ssl.certfile = "./etc/cert.pem"
ssl.keyfile = "./etc/key.pem"


APP.CONFIG



{stanchion, [
             {ssl, [
                    {certfile, "./etc/cert.pem"},
                    {keyfile, "./etc/key.pem"}
                   ]},
             %% Other configs
            ]}


Setting up an administrator account



The administrator account is created when you configure the Riak CS component. The same credentials must also be added to Stanchion. They are set in stanchion.conf, which is located in the /etc/stanchion directory. Enter the same admin.key and admin.secret that you used for Riak CS:



STANCHION.CONF



admin.key = OUCXMB6I3HOZ6D0GWO2D
admin.secret = a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag==


APP.CONFIG



{stanchion, [
           %% Admin user credentials
           {admin_key, "OUCXMB6I3HOZ6D0GWO2D"},
           {admin_secret, "a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag=="},
           %% Other configs
          ]}
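
Because the credentials have to be identical in both places, a quick comparison can catch copy-and-paste mistakes. The following is a minimal sketch, assuming both files are in the `key = value` format shown above. The function names are hypothetical and not part of Riak CS or Stanchion.

```python
import re


def read_setting(conf_text, key):
    """Return the value of `key` from 'key = value' text, or None if absent."""
    pattern = r"^[ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*(.*?)[ \t]*$"
    match = re.search(pattern, conf_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def admin_credentials_match(riak_cs_conf, stanchion_conf):
    """True if admin.key and admin.secret are set and identical in both texts."""
    for key in ("admin.key", "admin.secret"):
        cs_value = read_setting(riak_cs_conf, key)
        if cs_value is None or cs_value != read_setting(stanchion_conf, key):
            return False
    return True


# The sample credentials from this article, used as inline test data.
conf = (
    "admin.key = OUCXMB6I3HOZ6D0GWO2D\n"
    "admin.secret = a58Mqd3qN-SqCoFIta58Mqd3qN7umE2hnunGag==\n"
)
print(admin_credentials_match(conf, conf))
```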


Setting up Riak KV information



If you are running a single node for experimentation, or if the Riak KV node is running locally and configured to listen for protocol buffer traffic at 0.0.0.0, then the default setting should be fine.



Otherwise, update the IP address and port for the Riak host in the Stanchion config file.



STANCHION.CONF



riak_host = 127.0.0.1:8087


APP.CONFIG



{stanchion, [
             {riak_host, {"127.0.0.1", 8087}},
             %% Other configs
            ]}
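
Before starting Stanchion, you may want to confirm that the address given in riak_host is reachable from the Stanchion node. The sketch below only tests that a TCP connection can be opened. It does not speak the Protocol Buffers protocol, and the helper names are assumptions made for this example.

```python
import socket


def split_host_port(address):
    """Split 'host:port' (the format used by riak_host) into (host, port)."""
    host, _, port = address.rpartition(":")
    return host, int(port)


def can_connect(address, timeout=2.0):
    """Return True if a TCP connection to 'host:port' can be opened."""
    host, port = split_host_port(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts and unreachable hosts end up here.
        return False
```

For the default configuration you would call `can_connect("127.0.0.1:8087")` on the Stanchion node.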


Load Balancing and Proxy for Riak CS







If you plan to use Riak CS in production, we strongly recommend that you place a load balancer or proxy, software or hardware, in front of Riak CS. Also note that you must not expose Riak CS directly to public network interfaces.



Riak CS users report successful use of Riak CS with a load balancer or proxy. Common solutions include proprietary hardware load balancers, cloud-based options such as Amazon's Elastic Load Balancer, and open source software such as HAProxy and Nginx.



This guide briefly covers two open source solutions, HAProxy and Nginx, and offers some configuration and operational advice gathered from the Riak user and engineering community.



HAProxy



HAProxy is a fast and reliable solution for load balancing and proxying HTTP and TCP application traffic.



Users report successful use of HAProxy in combination with Riak CS in a number of configurations and scenarios. Much of the information and the configuration examples in this section come from experienced users in the Riak CS community, along with comments from Riak engineers.



Configuration Example



The following example is a starting point for configuring HAProxy as a load balancer for a Riak CS installation.



Note on Open File Limits



The operating system's open file limit must be greater than 256,000 for the following configuration example. See the open file limits documentation for how to set this value on different operating systems.
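
To see where a host currently stands, you can query the limit from the account that will run HAProxy. This is a minimal sketch for Unix-like systems using Python's standard resource module. The 256,000 threshold comes from the note above, and the function name is made up for the example.

```python
import resource

REQUIRED_OPEN_FILES = 256000  # value of maxconn in the HAProxy example


def open_file_limit_ok(soft_limit, required=REQUIRED_OPEN_FILES):
    """True if the soft limit is strictly greater than the required value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > required


# Limits of the current process; run this as the user that starts HAProxy.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)
print("sufficient for the example:", open_file_limit_ok(soft))
```

The check only reports the limit inherited by the current shell session, so a service manager that sets its own limits for HAProxy may give a different result.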


CONFIG



global
    log 127.0.0.1     local0
    log 127.0.0.1     local1 notice
    maxconn           256000
    spread-checks     5
    daemon

defaults
    log               global
    option            dontlognull
    option            redispatch
    option            allbackups
    no option         httpclose
    retries           3
    maxconn           256000
    timeout connect   5000
    timeout client    5000
    timeout server    5000

frontend riak_cs
    bind              10.0.24.100:8080
    # Example bind for SSL termination
    # bind            10.0.24.100:8443 ssl crt /opt/local/haproxy/etc/data.pem
    mode              http
    option            httplog
    capture           request header Host len 64
    acl good_ips      src -f /opt/local/haproxy/etc/gip.lst
    block if          !good_ips
    use_backend       riak_cs_backend if good_ips

backend riak_cs_backend
    mode              http
    balance           roundrobin
    # Ping Riak CS to determine health
    option            httpchk GET /riak-cs/ping
    timeout connect 60s
    timeout http-request 60s
    server riak1 r1s01.example.com:8081 weight 1 maxconn 1024 check
    server riak2 r1s02.example.com:8081 weight 1 maxconn 1024 check
    server riak3 r1s03.example.com:8081 weight 1 maxconn 1024 check
    server riak4 r1s04.example.com:8081 weight 1 maxconn 1024 check
    server riak5 r1s05.example.com:8081 weight 1 maxconn 1024 check


Please note that the above example is only a starting point and is a work in progress. Take care when applying this configuration, and adapt it to suit your environment.

One detail worth noting in the example is the commented-out bind line for SSL termination. HAProxy has supported SSL natively since version 1.5. Provided your HAProxy instance is built with OpenSSL support, you can enable SSL by uncommenting that line and adapting it to your environment.



You can find more information in the HAProxy documentation.



Also, pay attention to the Riak CS health check performed through the /riak-cs/ping endpoint. This option is essential for verifying the health of each Riak CS node that takes part in round-robin load balancing.
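
The same endpoint is handy for checking a node by hand or from a monitoring script. The sketch below sends the request that the HAProxy health check sends and treats an HTTP 200 response as healthy. The function name is an assumption for this example, and the proxy bypass is a design choice so that the request reaches the node directly.

```python
import urllib.request


def riak_cs_is_healthy(base_url, timeout=5.0):
    """Return True if GET <base_url>/riak-cs/ping answers with HTTP 200."""
    url = base_url.rstrip("/") + "/riak-cs/ping"
    # An empty ProxyHandler bypasses any system-wide HTTP proxy so that the
    # request goes straight to the node being checked.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP error responses.
        return False
```

With the backend names from the HAProxy example, a call would look like `riak_cs_is_healthy("http://r1s01.example.com:8081")`.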



Nginx



Several users have reported successfully using an Nginx HTTP server to proxy requests for Riak CS. An example providing access to Riak CS is included here for reference.



Configuration Example



Below is an example of initial configuration for Nginx to act as a front-end proxy for Riak CS.



CONFIG



upstream riak_cs_host {
  server  10.0.1.10:8080;
}

server {
  listen   80;
  server_name  _;
  access_log  /var/log/nginx/riak_cs.access.log;

  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_redirect off;

    proxy_connect_timeout      90;
    proxy_send_timeout         90;
    proxy_read_timeout         90;

    proxy_buffer_size          64k;  # If set to a smaller value,
                                     # nginx can complain with a
                                     # "headers too large" error

    proxy_buffers 8  64k;   # Increase from default of (8, 8k).
                            # If left to default with increased
                            # proxy_buffer_size, nginx complains
                            # that proxy_busy_buffers_size is too
                            # large.

    proxy_pass http://riak_cs_host;
  }
}


Note that the proxy_set_header Host $http_host directive is required to ensure that the HTTP Host: header is passed to Riak CS as received, and not translated into the hostname or address of the Riak CS backend server.



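It is also important to note that the proxy_pass value should not end with a slash. With a trailing slash, Nginx rewrites the request URI before passing it to the backend instead of forwarding it as received, which can lead to various problems.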
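
To see why the Host header matters, recall that S3 clients such as s3cmd can address a bucket through the host name, as in the host_bucket setting %(bucket)s.<domain>. The sketch below is a simplified illustration of that idea and not Riak CS's actual code. The function and the root_host parameter are hypothetical stand-ins.

```python
def bucket_from_host(host_header, root_host):
    """Return the bucket encoded in a virtual-hosted-style Host header.

    Returns None when the header carries no bucket, for example when a
    proxy has replaced it with the backend's own address.
    """
    host = host_header.split(":", 1)[0].lower()  # drop an optional port
    suffix = "." + root_host.lower()
    if host.endswith(suffix) and len(host) > len(suffix):
        return host[: -len(suffix)]
    return None


# Header as sent by the client: the bucket name is recoverable.
print(bucket_from_host("my_cs_bucket.data.example.com", "data.example.com"))
# Header rewritten to the upstream address: the bucket name is lost.
print(bucket_from_host("10.0.1.10:8080", "data.example.com"))
```

If the proxy substituted the upstream address for the original header, the second case shows that the bucket could no longer be recovered from the request.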



S3 client setup







This tutorial shows you how to use s3cmd as an S3 client. While it won't cover all of the client's functionality, it will show you how to create a configuration and run some basic commands.

Note: s3cmd Signature Version



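If you are using s3cmd version 1.5.0 or higher, you need to add the --signature-v2 flag to every command that targets the Riak CS cluster, so that s3cmd uses AWS Signature Version 2 rather than its default, AWS Signature Version 4. Alternatively, set signature_v2 = True in the configuration file, as the sample files in this article do.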

Initial setup



To use s3cmd with Riak CS, you must configure the utility to talk to your Riak CS system. One way is to create a .s3cfg file and save it in your home directory. Whenever you run an s3cmd command, this file is read by default. Alternatively, you can specify a config file using the -c flag. Example:



SHELL



s3cmd -c /PATH/TO/CONFIG/FILE <command>
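
If you drive s3cmd from a script, it is convenient to assemble the command line in one place so that the config file and the signature flag are never forgotten. The helper below is a hypothetical sketch. The bucket name test-bucket and the file notes.txt are placeholders, while mb, put and ls are standard s3cmd commands.

```python
import shlex


def s3cmd_command(config_path, *args, signature_v2=True):
    """Build an s3cmd argument list that uses an explicit config file."""
    command = ["s3cmd", "-c", config_path]
    if signature_v2:
        command.append("--signature-v2")
    command.extend(args)
    return command


# Print the commands for creating a bucket, uploading a file and listing it.
for args in (
    ("mb", "s3://test-bucket"),
    ("put", "notes.txt", "s3://test-bucket"),
    ("ls", "s3://test-bucket"),
):
    print(shlex.join(s3cmd_command("/PATH/TO/CONFIG/FILE", *args)))
```

To actually execute a command, pass the returned list to `subprocess.run(command, check=True)` on a machine where s3cmd is installed.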


An alternative way to configure s3cmd is to run s3cmd --configure, which launches an interactive tool and builds a config file based on what you enter.



In the next sections, you will find two sample .s3cfg files that can be used to configure communication with Riak CS.



Sample s3cmd config file for local use



Use this sample .s3cfg file to communicate with a local Riak CS instance on port 8080 with s3cmd (remember to substitute the values specific to your Riak CS installation where necessary).



CONFIG



[default]
access_key = 8QON4KC7BMAYYBCEX5J+
bucket_location = US
cloudfront_host = cloudfront.amazonaws.com
cloudfront_resource = /2010-07-15/distribution
default_mime_type = binary/octet-stream
delete_removed = False
dry_run = False
enable_multipart = False
encoding = UTF-8
encrypt = False
follow_symlinks = False
force = False
get_continue = False
gpg_command = /usr/local/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = password
guess_mime_type = True
host_base = s3.amazonaws.com
host_bucket = %(bucket)s.s3.amazonaws.com
human_readable_sizes = False
list_md5 = False
log_target_prefix =
preserve_attrs = True
progress_meter = True
proxy_host = localhost
proxy_port = 8080
recursive = False
recv_chunk = 4096
reduced_redundancy = False
secret_key = rGyDLBi7clBuvrdrkFA6mAJkwJ3ApUVr4Pr9Aw==
send_chunk = 4096
simpledb_host = sdb.amazonaws.com
skip_existing = False
socket_timeout = 300
urlencoding_mode = normal
use_https = False
verbosity = WARNING
signature_v2 = True


Sample s3cmd config file for production use



Use this sample .s3cfg file to interact with Riak CS via s3cmd on a production system.



CONFIG



[default]
access_key = EJ8IUJX9X0F2P9HAMIB0
bucket_location = US
cloudfront_host = cloudfront.amazonaws.com
cloudfront_resource = /2010-07-15/distribution
default_mime_type = binary/octet-stream
delete_removed = False
dry_run = False
enable_multipart = False
encoding = UTF-8
encrypt = False
follow_symlinks = False
force = False
get_continue = False
gpg_command = /usr/local/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = password
guess_mime_type = True
host_base = <YOUR DOMAIN HERE>
host_bucket = %(bucket)s.<YOUR DOMAIN HERE>
human_readable_sizes = False
list_md5 = False
log_target_prefix =
preserve_attrs = True
progress_meter = True
proxy_host =
proxy_port = 0
recursive = False
recv_chunk = 4096
reduced_redundancy = False
secret_key = XOY/9IFKVEDUl6Allrkj7oyH9XW+CANnFLEVuw==
send_chunk = 4096
simpledb_host = sdb.amazonaws.com
skip_existing = False
socket_timeout = 300
urlencoding_mode = normal
use_https = True
verbosity = WARNING
signature_v2 = True


To configure the s3cmd client for a particular user, set access_key and secret_key to that user's credentials.



Configuring storage location



By default, the .s3cfg file uses the Amazon S3 service as the storage backend. For a Riak CS system, change the following settings to point to your storage system.



  • host_base - Specify the domain name or path of your data store, for example data.example.com
  • host_bucket - Specify the bucket location, for example my_cs_bucket.data.example.com
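
These two settings can also be changed programmatically, for instance when generating config files for several users. The sketch below uses Python's standard configparser module. The function name is hypothetical, and data.example.com is the example domain from the list above.

```python
import configparser
import io


def point_s3cfg_at(s3cfg_text, domain):
    """Return .s3cfg text with host_base and host_bucket set for `domain`."""
    # interpolation=None keeps literal '%(bucket)s' placeholders intact.
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(s3cfg_text)
    config["default"]["host_base"] = domain
    config["default"]["host_bucket"] = "%(bucket)s." + domain
    out = io.StringIO()
    config.write(out)
    return out.getvalue()


# A shortened config for illustration; a real file has many more settings.
sample = """[default]
host_base = s3.amazonaws.com
host_bucket = %(bucket)s.s3.amazonaws.com
use_https = False
"""
print(point_s3cfg_at(sample, "data.example.com"))
```

All other settings in the file pass through unchanged, so the same function works on the full sample files shown earlier.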


Using SSL in the client



If you are using SSL, set the use_https parameter to True.



Links



Riak Cloud Storage. Part 1. Configuring Riak KV

Riak Cloud Storage. Part 2. Configuring the Riak CS component

Riak Cloud Storage. Part 3. Stanchion, Proxy and Load Balancing, S3 Client

Original manual.


