
Today, August 1, 2014, I joined ツキノワ株式会社 (Tsukinowa Inc.). I will do my best as its CTO (in practice: odd jobs).

ツキノワ is a subsidiary of 株式会社時雨堂 (Shiguredo Inc.).

Tsukinowa is accepting the following kinds of work. Its focus differs somewhat from that of the parent company.

  • Providing services using MQTT, consultant on MQTT
  • Service operation / maintenance improvement / automation (Ansible)
  • Document creation (Sphinx)
  • Web application development (Python, golang)
  • Application development for smartphones such as iOS and Android
  • Web site design creation

Tsukinowa is enthusiastically accepting work right now. We look forward to hearing from you.

入門Ansible has been published

I have published "入門Ansible" (Introduction to Ansible) on Amazon.

Until now there was no Japanese book on Ansible; this one was written to serve as the introductory text, the first thing you read in order to understand Ansible.

I would like to thank everyone who reviewed drafts during the writing of this book.

The table of contents is a bit long:

  • Introduction
    • Characteristics of Ansible
    • Ansible is simple
    • Difference from Chef and Puppet
    • Ansible is "Better Shell Script"
  • Let's use Ansible
    • Installation
    • Inventory file
    • Module
  • Let's make a playbook
    • YAML grammar
    • I will write a playbook
    • Explanation of playbook
    • Task
    • Handler
    • Frequently used modules
  • Let's make complicated playbook
    • Repeat - with_items
    • Save the output and use it later - register
    • Conditional execution - when
    • Repeat until it succeeds - until
    • External information reference - lookup
    • Process variables - filter
    • Enter from the keyboard - vars_prompt
    • Run on management host - local_action
    • Change the module to be executed with a variable - action
    • Set environment variable - environment
    • Ignore it even if it fails - ignore_errors
    • Execute task asynchronously - async
  • Let's build a large playbook
    • Load another playbook - include
    • Recommended Directory Configuration - Best Practices
    • Collectively reuse - role
    • Parallel execution - fork
    • Run sequentially - serial
    • Cooperation with AWS EC2
    • Dynamically create list of hosts - dynamic inventory
  • Let's use command line options
    • SSH authentication
    • Limit target host - limit
    • Restrict the task to be executed - tag
    • Dry run - check
    • Run while confirming task - step
    • Difference display - diff
  • Encrypt variable file - ansible-vault
    • How to use ansible-vault
    • How to use encrypted files
  • Let's use the published role - Ansible Galaxy
    • What is Ansible Galaxy
    • How to search for role
    • Get the role
    • How to use role
  • FAQ
    • can not connect
    • ControlPath too long error
    • Even if it runs it stops halfway
    • I want to connect without inventory file
    • One playbook got complicated
    • I get an error called python not found
    • I want to use it with Windows
    • A strange picture appears in ansible-playbook
    • I want to know the variables collected by ansible
    • Invalid type <type 'list'>
    • "Syntax Error" appears when using variable
    • What does "---" mean?
    • What is the origin of the name Ansible?
  • In conclusion
  • Appendix : Writing your own module
    • Module behavior
    • Module type
    • Sample module
    • Useful functions in Python
    • Debugging modules
  • Appendix : Ansible Plugin
    • Types of plugins
    • Lookup plugin
    • Filter plugin
    • Callback plugin
    • Action plugin
    • Connection type plugin
    • Vars plugin
  • Appendix : Ansible config file
    • Default section
    • Paramiko section
    • Ssh_connection section
    • Accelerate section

Since it is an "introduction", the book starts at the stage before you write your first playbook: what Ansible is, installation, a YAML primer, and so on. It then covers basic playbooks first and only introduces the more complex features afterwards.

That said, it also covers details of the configuration file and how to write modules and plugins, so I think non-beginners will find it worthwhile too.

Sample

For those who would like a glimpse of what the book is like, I am publishing chapters 1 and 2 as a sample.

入門Ansibleサンプル

Please note that the sample is a PDF, so its formatting differs from the version purchased on Amazon.

Issue Tracker

https://bitbucket.org/r_rudi/ansible-book/issues

The issue tracker is public. If you find typos or other mistakes, please register them there and I will fix them. Requests such as "please explain this part in more detail" are also welcome; I will reflect them as much as I can.

Future plans

In about three months, I will fix the issues received via the tracker and also publish PDF and ePub versions on Gumroad. The continuously revised version will, of course, remain on sale on Amazon as well.

Incidentally, there are no plans for a paper edition at the moment.

Other

  • I wrote it entirely with Sphinx. I will write about that separately.
  • I am looking for someone who can draw a cover illustration...

InfluxDB performance evaluation

I tried out InfluxDB and ran a rough performance test.

Note that the single-node test and the cluster test were run about a month apart, and this is not a rigorous benchmark, so please treat the numbers as a rough reference.

Single-node test

  • DigitalOcean 1GB 30GB (SSD)

I first tried a 512 MB droplet, but the process was killed by the OOM killer, so I switched to 1 GB.

Registration

From another node, I sent 10 million lines (6 GB) of data like the following via HTTP POST.

The batch size was 30, i.e., 30 records were registered per request.

{"int": -74, "str": "ɊƭŷćҏŃȅƒŕƘȉƒŜőȈŃɊҏŷ","uint":3440,"time":1386688205}

By the way, the string field is a 100-character Unicode string generated with rotunicode.
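For reference, the load-generation setup described above can be sketched in Python like this (my own reconstruction, not the actual script; the series name `stress` and the InfluxDB 0.x-style `/db/<db>/series` POST body are assumptions):

```python
import json
import random

def make_row():
    # One synthetic record, matching the shape shown above.
    # (Here the string is plain ASCII; the actual test used rotunicode output.)
    return {"int": random.randint(-100, 100),
            "str": "x" * 100,
            "uint": random.randint(0, 10000),
            "time": 1386688205}

def batches(rows, size=30):
    # Group rows into batches of `size` (the batch size used above).
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def to_series_body(batch):
    # InfluxDB 0.x expects [{name, columns, points}] as the POST body
    # for /db/<db>/series (an assumption based on the 0.x HTTP API).
    cols = ["int", "str", "uint", "time"]
    return json.dumps([{"name": "stress",
                        "columns": cols,
                        "points": [[r[c] for c in cols] for r in batch]}])

# each body would be POSTed to http://<host>:8086/db/<db>/series
rows = [make_row() for _ in range(100)]
print(sum(1 for _ in batches(rows)))  # 100 rows -> 4 batches of <= 30
```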

Here is the dstat output on the server at the time; the load-generating side was completely losing (the server still had idle CPU).

----total-cpu-usage---- -dsk/total- -net/total- ---paging-----system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
 52  19  27   0   2   0|   0  1508k|2313k  140k|   0     0 |5180  10k
 40  18  40   0   1   0|   0     0 |1853k  112k|   0     0 |4922  9888
 41  19  36   2   1   0|   0  2740k|1894k  113k|   0     0 |4928  9944
 46  18  34   1   1   0|   0  1504k|2009k  121k|   0     0 |4752  9516
 42  19  38   0   1   0|   0     0 |1830k  110k|   0     0 |5050  10k
 44  20  34   0   2   0|   0     0 |2022k  121k|   0     0 |5536  11k

I had hoped it would look more like this (CPU fully used), and at times it did:

88   8   0   3   1   0|   0  6124k|4806k  131k|   0     0 |2655  4280
87   8   0   3   2   0|   0  7232k|4785k  129k|   0     0 |2185  3364
54  11   0  34   1   0|   0  2136k|4784k  129k|   0     0 |4640  8752

The whole run took this long:

real    56m35.234s
user    16m19.658s

At this point, the DB was 2.7 GB.

10 million rows in 56 minutes works out, by simple division, to roughly 2,800 rows per second. Since 30 rows are actually sent per request, that is about 93 requests per second.

As the dstat output above shows, the CPU appears to be the bottleneck. Since DigitalOcean uses SSDs, I/O was not a problem here, but on an ordinary HDD, I/O might become one.
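Redoing the arithmetic with the exact wall-clock time (the article's ~2,800 rows/s and 93 qps come from rounding the run up to about an hour):

```python
rows = 10_000_000
elapsed = 56 * 60 + 35.234       # real time: 56m35.234s
batch = 30

rows_per_sec = rows / elapsed            # ~2945 rows/s
requests_per_sec = rows_per_sec / batch  # ~98 POSTs/s

print(round(rows_per_sec), round(requests_per_sec))  # prints: 2945 98
```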

Query

Each figure is the total time over 10 runs.

select * from stress limit 1000
  real 0.417s

select max(int) from stress
  real 15m50.002s

select max(int) from stress where time < '2013-12-12'
  236400 records, real 0m10.699s

select count(int) from stress where '2013-12-10 23:00:00' < time and time < '2013-12-11 00:00:00'
  7202 records, real 0m0.454s

By the way, CPU usage during the max query was 99%, so I think the CPU is the bottleneck.

Cluster system

Clustering is InfluxDB's big selling point, so let's build a cluster next.

The documentation did not explain how to set up InfluxDB clustering, but I got it working with the following settings in config.toml:

  • On the first node, comment out seed-servers
  • On the second node, set seed-servers = ["(first node's address):8090"]

That was all. Once the cluster comes up, the servers appear under "Cluster" in the Web UI.

I have not checked whether the third node should also list the second node's address, but it probably just works if you add it. The point is that only the first node is special.

I finally figured this part out from this mailing-list post.

result

I sent the same load as before. Note that the POST destination was a single node only.

real    35m25.989s
user    14m33.846s

It did not quite halve, but it got considerably faster: about 4,700 rows per second, or roughly 156 requests per second.

At this point, disk usage was 1.4 GB on each node; in other words, the data was split almost evenly.

In addition, 18 shards were made.

Query

select * from stress limit 1000
  real 0m3.746s

select max(int) from stress where time < '2013-12-12'
  236400 records, real 0m11.530s

For some reason everything got slower overall. The first query in particular does not cross shards, and its shard is not on a server other than the one that received the query, so the slowdown is puzzling.

About shard

From InfluxDB 0.5, data is placed as described below.

https://groups.google.com/forum/#!msg/influxdb/3jQQMXmXd6Q/cGcmFjM-f8YJ

That is:

  • Incoming data is divided into shards by fixed time windows based on its time value (the default is seven days). Each shard gets its own LevelDB instance.
  • Each shard is assigned to a responsible server.
  • Shards are copied according to the replication factor; that is, the number of servers holding each shard increases.
  • Shards are further divided by the split value.
  • Which split a point goes into is determined by hash(database, seriesName).
  • If split-random is set, the split is chosen at random (apparently it can also be controlled with a regex?).
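The placement rules above can be modeled roughly like this (a simplified sketch, not InfluxDB's actual code; the function names are made up and the round-robin server assignment is an assumption):

```python
import hashlib

SHARD_DURATION = 7 * 24 * 3600  # default: one shard per 7 days

def shard_index(timestamp):
    # Shards are cut by fixed time windows over the point's time value.
    return timestamp // SHARD_DURATION

def split_index(database, series_name, n_splits):
    # Within a shard, the split is chosen by hash(database, seriesName),
    # so all points of one series land in the same split.
    key = ('%s:%s' % (database, series_name)).encode()
    return int(hashlib.md5(key).hexdigest(), 16) % n_splits

def owners(shard_idx, servers, replication_factor):
    # Each shard is assigned to `replication_factor` servers
    # (simple round-robin here; the real assignment differs).
    start = shard_idx % len(servers)
    return [servers[(start + i) % len(servers)]
            for i in range(replication_factor)]

# two points 10 days apart fall into different shards
assert shard_index(1386688205) != shard_index(1386688205 + 10 * 86400)
# the same series always maps to the same split
assert split_index('stress_db', 'stress', 4) == split_index('stress_db', 'stress', 4)
print(owners(shard_index(1386688205), ['node1', 'node2'], 2))
```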

From this it follows that:

  • The speed of a single query does not change: only one machine executes it, so in the end performance is determined by LevelDB.
  • The shard size (time period) also matters: a query within one shard is fast, but a query crossing shards becomes slow.
  • Raising the replication factor increases the number of copies, so parallel read performance improves; however, the number of writes also increases.
  • Setting split improves write performance, because writes are further distributed within a shard; however, queries that cross splits will probably be slower.

Is that about right? I have not verified this properly, so there may be mistakes.

As for this run's writes, the shards assigned to the other node apparently did not have to be written to the local disk, which would account for the speedup.

However, I cannot explain why queries that should not cross shards got slower. I will wait for others to verify this.

So this is not a very rigorous evaluation, but it should convey the general picture.

How to setup influx db cluster

influxdb supports clustering, but I could not find how to set up a cluster.

Finally I found this mail, and it works.

How to

When you want to set up clustering, edit your config.toml:

  1. For the first node, comment out the seed-servers line

  2. For the later nodes, set the first node address to the seed-servers like this

    seed-servers = ["(FIRST_NODE_IP_ADDRESS):8090"]
    

That's all. After the cluster is up successfully, you can find the servers in the "Cluster" menu of each node's Web UI.

How to switch the execution target with ansible

We often have multiple environments: production, staging, development, and so on, and it is common for the servers and settings to differ per environment.

Ansible offers several ways to switch between such targets, so let me introduce them.

If the server is different but the settings are the same

In this case, I think it is best to switch groups using the inventory file.

[stg:children]
stg_web
stg_db

[prod:children]
prod_web
prod_db

[stg_web]
stg_web_01

[stg_db]
stg_db_01

[prod_web]
prod_web_01

[prod_db]
prod_db_01

Write the inventory like this, and then run:

% ansible-playbook -l prod something.yml

That is, you switch the group with -l. Groups can also be nested as in the example above, so there is no problem even if one environment consists of multiple groups.

Switching the tasks to execute

Sometimes only the last task or two differs between environments. In that case, you can switch using when.

- name: copy file for the production environment
  copy: src=config.ini.prod dest=/etc/config.ini
  when: target == 'prod'

- name: copy file for the staging environment
  copy: src=config.ini.stg dest=/etc/config.ini
  when: target == 'stg'

Then, running as before:

% ansible-playbook -l prod something.yml

copies the file appropriate for the execution target. Here we assume the variable target holds the execution target (set, for example, in group_vars or passed with -e target=prod).

Switching with tags

Ansible can select tagged tasks at runtime with the --tags (-t) option.

- name: copy file for the production environment
  copy: src=config.ini.prod dest=/etc/config.ini
  tags:
    - prod

- name: copy file for the staging environment
  copy: src=config.ini.stg dest=/etc/config.ini
  tags:
    - stg

If you do,

% ansible-playbook -l stg -t prod something.yml

you can, for example, copy the production file into the staging environment, something the inventory alone cannot do. Once in a while that is actually useful.

Switching with include

In the examples so far, tags had to be attached per task, which becomes painful when many tasks need switching.

Cut the differing tasks out into separate files, and then write either:

- include: prod.yml  tags=prod
- include: stg.yml  tags=stg

or:

- include: prod.yml
  when: target == 'prod'
- include: stg.yml
  when: target == 'stg'

Either way, you can do the switching at the include level.

Switch by role

As a playbook grows it is better to split it into roles, and I imagine some people do.

Since when can also be applied to a role:

roles:
  - { role: prod, when: target == 'prod' }
  - { role: stg, when: target == 'stg' }

this works too. However, it requires creating two roles named prod and stg, so normally you would define variables instead, and make it switchable by tags at the same time:

roles:
  - { role: something, port: 80, tags: ['prod'] }
  - { role: something, port: 8080, tags: ['stg'] }

Switch vars

If the differences are variables in the first place, you only need to swap the vars. But vars cannot take when and the like.

If you want that, use the include_vars module.

tasks:
  - include_vars: vars_prod.yml
    when: target == "prod"
  - include_vars: vars_stg.yml
    when: target == "stg"

  - debug: msg=" port is {{ port }}"

Note

I tried this with 1.6.2, and when combined with tags, the subsequent tasks do not seem to be executed. I will look into it later.

Combining this with vars_prompt:

---
- hosts: all
  gather_facts: no
  vars_prompt:
    env: "What is your target env?"
  tasks:
    - include_vars: vars_prod.yml
      when: env == "prod"
    - include_vars: vars_stg.yml
      when: env == "stg"
    - debug: msg="port is {{ port }}"

When the playbook runs you are asked "What is your target env?", and the vars file matching your answer is loaded. Well, perhaps it is not that useful...

Summary

Wanting to switch the execution target is a common situation; I have introduced several methods you can use for it.

None of these methods is the one true way. If pressed, I would say that using roles and variables tends to pay off later, but you are not obliged to. Choose whichever method suits your environment.

Using rotunicode

A memo for my future self, for whenever I end up using this.

A Python library named rotunicode has been released by Box.

https://github.com/box/rotunicode

It is a library that converts ASCII (latin-1) strings into Unicode strings that look similar to the original ASCII.

By registering it as a codec like this, you can use it through the encode function:

import codecs
from box.util.rotunicode import RotUnicode

codecs.register(RotUnicode.search_function)

print('Testing rotunicode'.encode('rotunicode'))
>>> Ƭȅŝƭȉńġ ŕőƭȕńȉćőďȅ
print('Ƭȅŝƭȉńġ ŕőƭȕńȉćőďȅ'.decode('rotunicode'))
>>> Testing rotunicode

What is it for?

It makes it easy to test whether your code handles Unicode properly.

import os, errno

name = 'foo'.encode('rotunicode')
os.mkdir(name)

print(name)
>>> ƒőő

If you used a Japanese string such as "あいうえお" here, an English speaker's reaction might be "I don't want to use characters I've never seen before."

With rotunicode, however, you get a Unicode string that resembles the original ASCII, so you can hope for the reaction "hmm, maybe I could actually use this somewhere...". Maybe.
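To make the idea concrete, here is a tiny rotunicode-style mapping of my own (a sketch, not the library's implementation; the real codec covers all of a-z, A-Z, and 0-9):

```python
# Minimal rotunicode-style mapping: each ASCII letter gets a visually
# similar non-ASCII letter, and the mapping is reversible.
FORWARD = {'a': 'ȁ', 'e': 'ȅ', 'g': 'ġ', 'i': 'ȉ', 'n': 'ń',
           'o': 'ő', 'r': 'ŕ', 's': 'ŝ', 't': 'ƭ', 'u': 'ȕ'}
REVERSE = {v: k for k, v in FORWARD.items()}

def rot_encode(text):
    # Unmapped characters pass through unchanged.
    return ''.join(FORWARD.get(ch, ch) for ch in text)

def rot_decode(text):
    return ''.join(REVERSE.get(ch, ch) for ch in text)

encoded = rot_encode('testing rotunicode')
assert rot_decode(encoded) == 'testing rotunicode'  # round-trips
print(encoded)
```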

Execute ansible directly to the docker container

Note

Please be aware that this is still at the prototype stage. Comments and corrections are very welcome.

To build a Docker image, you normally write a Dockerfile. However, a Dockerfile is essentially just a simple shell script, which makes it hard to do anything sophisticated. One workaround is to build images with Packer, but Packer's ansible provisioner runs ansible-pull internally, so the image must contain an Ansible runtime and git.

Also, to run commands against a running container you have to ssh into it, which means bundling sshd and exposing the ssh port, and keeping track of the dynamically assigned external ssh port on the Docker host.

To solve these problems, I wrote a plugin that lets Ansible interact with Docker containers directly. (Note: Ansible can use transports other than ssh; that part has a pluggable architecture to which connections can be added.)

Access to docker with lxc

Since docker 0.9.0 ships with a library called libcontainer, docker no longer depends on LXC. For now, however, this plugin assumes docker is run with the LXC driver and accesses containers through LXC.

Premise

  • docker 0.9.1
  • LXC installed
  • Linux 3.8 or later
  • The image contains Python 2 (Ansible does not support Python 3, unfortunately)
  • /usr/bin/tee exists in the image (not /bin/tee)

Preparation

  1. The docker daemon is running with the LXC driver (docker -d -e lxc)
  2. The target containers have been started with docker run
  3. Get the two files from this gist and place them: docker_inventory.py goes next to your inventory/playbook and must be executable (chmod ugo+x), and docker_connection.py goes under a directory named connection_plugins

The resulting layout looks like this:

|- docker_inventory.py
|- connection_plugins
|  |
|  +- docker_connection.py
+- なにか.yml

Execution

The only difference from a normal ansible run is that you specify docker_inventory.py as the inventory.

% ansible-playbook -i docker_inventory.py  なにか.yml

This executes the specified playbook against a group named docker. Since the plugin invokes sudo on the host side, you may need to run sudo once beforehand so the credentials are cached.

Incidentally, because it does not go through ssh, execution is quite fast.

Internal details

An LXC access plugin using libvirt (libvirt_lxc.py) already exists in Ansible. In my environment, however, there was no need to drive docker's LXC containers through virsh and libvirt, so I decided to use lxc-attach directly. In other words, I reuse libvirt_lxc.py almost as-is.

Linux 3.8 or later is required to execute arbitrary commands with lxc-attach.

The containers' full IDs can be obtained with docker ps --no-trunc, and the plugin uses these container IDs to reach the containers via LXC.
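The inventory side of this can be sketched as follows (a hypothetical reconstruction in the spirit of docker_inventory.py, not the actual gist):

```python
#!/usr/bin/env python
# Hypothetical sketch of a dynamic inventory like docker_inventory.py:
# list running containers with `docker ps --no-trunc` and emit an
# Ansible inventory JSON with a single group named "docker".
import json
import subprocess

def parse_ps_output(output):
    # Take the first column (the full container ID) of each line,
    # skipping the header line.
    lines = output.strip().splitlines()[1:]
    return [line.split()[0] for line in lines if line.strip()]

def docker_group():
    # The live call; not invoked here so the sketch runs without docker.
    out = subprocess.check_output(['docker', 'ps', '--no-trunc'])
    return {'docker': parse_ps_output(out.decode())}

# Example of the parsing step on canned (fake) output:
sample = (
    "CONTAINER ID        IMAGE           COMMAND\n"
    "41c3f1c2b9de        ubuntu:12.04    /bin/bash\n"
    "9f2d8e0a7b6c        centos:6.4      /bin/bash\n"
)
print(json.dumps({'docker': parse_ps_output(sample)}))
```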

Actually, I wanted to group containers by image name instead of a single docker group, but I could only get the image IDs, so I shelved that.

Summary

Although it is a prototype, I created a plugin that runs ansible directly against running docker containers. You should be able to create an image by committing the container after running ansible-playbook.

Also, since containers are fast to create and run, this should be handy for trial and error while writing playbooks.

Furthermore, combining it with the docker module should allow more complex operations.

Finally, thanks to mopemope for the advice.

CoreOS image creation

You can also build CoreOS images yourself. Building your own image lets you, for example, set the password or preinstall software on the host side. (I am not sure how much benefit there is.)

The concrete image build procedure is described at:

https://coreos.com/docs/sdk-distributors/sdk/modifying-coreos/

This time I simply followed it as written.

Work content

Premise

Any x86-64 Linux machine can be used to build the image; the distribution does not matter.

Other than that,

  • git
  • curl

are required, but these are present on most systems anyway.

Install depot_tools

CoreOS is built using repo, the tool used for Android development. Clone depot_tools and add it to your PATH.

% git clone
https://chromium.googlesource.com/chromium/tools/depot_tools.git
% export PATH="$PATH":`pwd`/depot_tools

Create SDK chroot

% mkdir coreos; cd coreos
# initialize the .repo directory
% repo init -u https://github.com/coreos/manifest.git -g minilayout --repo-url \
https://chromium.googlesource.com/external/repo.git
# fetch every git repo specified in the manifest (this takes quite a while)
% repo sync

Create an image

Next, download the SDK (sudo is required):

% ./chromite/bin/cros_sdk

This also takes quite a while, but after waiting you will see:

cros_sdk:make_chroot: All set up.  To enter the chroot, run:
$ cros_sdk --enter

CAUTION: Do *NOT* rm -rf the chroot directory; if there are stale bind
mounts you may end up deleting your source tree too.  To unmount and
delete the chroot cleanly, use:
$ cros_sdk --delete

(cr) ((c15791e...)) r_rudi@hamspam ~/trunk/src/scripts $

To set the "core" user's password, run the following (it is in coreos/chroot/usr/lib64/crosutils):

./set_shared_user_password.sh

Set the target of this image to amd64 (x86 and others can also be specified):

echo amd64-usr > .default_board

To set up the board root filesystem under /build/${BOARD}, type:

./setup_board

After a while, the following appears and the step is complete:

INFO    setup_board: Elapsed time (setup_board): 4m16s
INFO    setup_board: The SYSROOT is: /build/amd64-usr

Build the binary packages. This takes about 30 minutes:

./build_packages
(省略)
>>> Using system located in ROOT tree /build/amd64-usr/

>>> No outdated packages were found on your system.
INFO    build_packages: Checking build root
INFO    build_packages: Builds complete
INFO    build_packages: Elapsed time (build_packages): 27m55s

When that finishes, build an image with the developer overlay:

./build_image --noenable_rootfs_verification dev

(省略)
2014/04/02 06:12:44 - generate_au_zip.py - INFO    : Generated
/mnt/host/source/src/build/images/amd64-usr/274.0.0+2014-04-02-0604-a1/au-generator.zip
COREOS_BUILD=274
COREOS_BRANCH=0
COREOS_PATCH=0
COREOS_SDK_VERSION=273.0.0
Done. Image(s) created in
/mnt/host/source/src/build/images/amd64-usr/274.0.0+2014-04-02-0604-a1
Developer image created as coreos_developer_image.bin
To convert it to a virtual machine image, use:
  ./image_to_vm.sh --from=../build/images/amd64-usr/274.0.0+2014-04-02-0604-a1 --board=amd64-usr
The default type is qemu, see ./image_to_vm.sh --help for other  options.

Since what comes out is a raw .bin image, convert it with ./image_to_vm.sh:

./image_to_vm.sh --format virtualbox

(省略)
INFO    build_image: Elasr/274.0.0+2014-04-02-0604-a1
INFO    image_to_vm.sh:  - coreos_developer_virtualbox_image.vmdk
INFO    image_to_vm.sh:  - coreos_developer_virtualbox.ovf
INFO    image_to_vm.sh:  - coreos_developer_virtualbox.README
Copy coreos_developer_virtualbox_image.vmdk and
coreos_developer_virtualbox.ovf to a VirtualBox host and run:
VBoxManage import coreos_developer_virtualbox.ovf

As shown at the end of build_image's output, a qemu image is created by default, but you can also create other formats. The formats available in this version are listed below; vagrant, ami, gce, and so on are possible.

  • ami
  • pxe
  • iso
  • openstack
  • qemu
  • qemu_no_kexec
  • rackspace
  • rackspace_vhd
  • vagrant
  • vagrant_vmware_fusion
  • virtualbox
  • vmware
  • vmware_insecure
  • xen
  • gce

Since I chose virtualbox this time, the files coreos_developer_virtualbox_image.vmdk and coreos_developer_virtualbox.ovf were produced.

% cd ~/trunk/src/build/images/amd64-usr/274.0.0+2014-04-02-0604-a1
% ls -l
-rw-r--r-- 1 r_rudi r_rudi 7.7M  4  2 06:12 au-generator.zip
-rw-r--r-- 1 r_rudi r_rudi 2.9G  4  2 06:12 coreos_developer_image.bin
-rw-r--r-- 1 r_rudi r_rudi  160  4  2 06:27 coreos_developer_virtualbox.README
-rw-r--r-- 1 r_rudi r_rudi  11K  4  2 06:27 coreos_developer_virtualbox.ovf
-rw-r--r-- 1 r_rudi r_rudi 471M  4  2 06:27 coreos_developer_virtualbox_image.vmdk
drwxr-xr-x 2 root   root   4.0K  4  2 06:04 rootfs
-rw-r--r-- 1 r_rudi r_rudi   75  4  2 06:12 version.txt

On the host side, these end up under coreos/src/build/images/amd64-usr/274.0.0+2014-04-02-0604-a1/.

Start this image with VirtualBox and you can log in as user core with the password you set earlier.

Summary

Actually, I have not used CoreOS seriously yet, but I tried building my first image.

The build scripts are well maintained, and there was nothing to stumble over at all.

Ansible 1.5 release

Ansible 1.5 has been released.

There are many changes in 1.5; here are some of them.

  • Parameter encryption with ansible vault
  • Acceleration by ssh pipelining
  • when_XXX, deprecated with a warning since 1.4, has now been removed completely. Use when instead.
  • only_if is deprecated.
  • Add no_log option to stop logging
  • Added parameters to git module (accept_hostkey, key_file, ssh_opts)
  • Addition of various modules

Among these many changes, this article covers ansible-vault, ssh pipelining, and the newly added assert module.

Ansible vault

Playbooks often need information you want to keep secret, such as passwords and API keys. Chef has encrypted data bags for this; Ansible has now gained the vault feature.

Use the ansible-vault command to encrypt files that are specified in vars_files.

  • Creating an encrypted file

    If you are starting from scratch, create one with create:

    % ansible-vault create vars.yml
    Vault Password:
    Confirm Vault Password:
    (the editor defined by $EDITOR then opens for input)

  • Encryption

    % ansible-vault encrypt vars.yml

    encrypts an existing plaintext file. By the way, running ansible-vault encrypt on an already encrypted file properly raises an error, so rest assured.

  • Decryption

    % ansible-vault decrypt vars.yml

    decrypts vars.yml. Be aware that the vars.yml file itself is replaced.

  • Editing

    % ansible-vault edit vars.yml

    launches the editor, just as when creating the file.

  • Changing the password

    % ansible-vault rekey vars.yml

    changes the password.

A file encrypted with these commands is then used like any other vars file:

---
- hosts: localhost
  vars_files:
      - vars.yml
  tasks:
    - debug: msg="{{ spam }}"

With this in place, pass --ask-vault-pass at execution time and you will be prompted for the password. If you omit it and the file is encrypted, you get an error.

% ansible-playbook -i ansible_hosts  vault.yml --ask-vault-pass
Vault password:

Alternatively, you can specify a password file as follows.

% ansible-playbook -i ansible_hosts  vault.yml --vault-password-file ~/.ssh/pass.txt

--vault-password-file can also be given to ansible-pull, which is useful for people who run ansible-pull from cron.

The vault documentation is located here.

However, as this blog post points out, ansible-vault encrypts the whole file. So you cannot, for example, grep for a variable name.

This may be improved in the future, but for now it is good to keep these limitations in mind.

SSH Pipelining

Ansible writes each module out as a script file, transfers it to the remote host, and then executes it. SSH pipelining speeds up execution by reducing the number of ssh invocations and actual file transfers. (Sorry, I have not chased the details.)

The effect is considerable: a run that used to take 1 minute finished in 30 seconds.

Pipelining is disabled by default, but you can enable it in ansible.cfg like this:

[defaults]
pipelining=True

Warning

Note on using sudo

If you use sudo, requiretty must be disabled in /etc/sudoers on the remote side.
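One way to do that is a per-user override in sudoers (a sketch; the user name deploy is a placeholder for whatever account ansible connects as, and the file should be edited via visudo):

```
# /etc/sudoers.d/ansible -- allow sudo without a tty for this user only
Defaults:deploy !requiretty
```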

Assert module

Various modules are added with every release, but the one I personally like best this time is the assert module.

The assert module is used as follows.

  assert: { that: "ansible_os_family != 'RedHat'" }

Alternatively, you can check multiple conditions:

assert:
   that:
     - "'foo' in some_command_result.stdout"
     - "number_of_the_counting == 3"

Actually this is equivalent to combining the fail module with a condition, but assert is easier to write. The mandatory that key feels slightly odd, though, and the conditions must be wrapped in double quotes (well, it was not as bad as I feared).

By the way, here is what a run looks like:

_____________________________________________________
< TASK: assert that="ansible_os_family == \"Ubuntu\"" >
 -----------------------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

failed: [localhost] => {"assertion": "ansible_os_family == \"Ubuntu\"", "evaluated_to": false, "failed": true}

If the assert module works out, you may no longer need a separate mechanism such as serverspec or envassert to check server state. (I will investigate whether things like port numbers can be checked...)

Summary

Ansible 1.5 was released, and I introduced some of its important new features.

In the 1.6 release, the $foo and ${foo} variable reference forms are expected to be removed completely. Keep that in mind when writing playbooks from now on.

What I have studied about riak's handoff

For a source code reading session, I decided to examine riak's handoff in detail.

This is a summary for myself to grasp the flow of processing, so it is probably not much use to others as-is.

About handoff

When ring_update is called and the ring is updated, nodes are rearranged, so the range of the ring each vnode is responsible for also changes. Accordingly, the actual data each vnode holds must be sent to the newly responsible vnode. This is called handoff.
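As a mental model of handoff (my own sketch in Python, unrelated to the actual Erlang implementation): when ring membership changes, the transfers needed are exactly the partitions whose owner differs between the old and new ring.

```python
def plan_handoffs(old_owners, new_owners):
    """Given {partition: node} maps before and after a ring change,
    return the transfers needed as (partition, from_node, to_node)."""
    return [(p, old_owners[p], new_owners[p])
            for p in sorted(old_owners)
            if new_owners[p] != old_owners[p]]

# a 4-partition ring; node 'n3' joins and takes over two partitions
old = {0: 'n1', 1: 'n2', 2: 'n1', 3: 'n2'}
new = {0: 'n1', 1: 'n2', 2: 'n3', 3: 'n3'}
print(plan_handoffs(old, new))  # [(2, 'n1', 'n3'), (3, 'n2', 'n3')]
```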

Process flow

  1. riak_core_vnode_manager:schedule_management_timer/0

    A timer is set with send_after, and handle_info is called for every management_tick.

    schedule_management_timer() ->
        ManagementTick = app_helper:get_env(riak_core,
                                            vnode_management_timer,
                                            10000),
        erlang:send_after(ManagementTick, ?MODULE, management_tick).

  2. riak_core_vnode_manager:handle_info(management_tick, State0)

    It checks whether ring has changed in this function called at regular intervals.

    RingID = riak_core_ring_manager:get_ring_id(),
    {ok, Ring, CHBin} = riak_core_ring_manager:get_raw_ring_chashbin(),
    State = maybe_ring_changed(RingID, Ring, CHBin, State0),

    The ring update is handled inside maybe_ring_changed. After that, this function finishes by checking the ring with calls such as check_repairs/1.

  3. riak_core_vnode_manager:maybe_ring_changed/4

    It compares RingID with last_ring_id to determine whether the ring has changed. If they differ, it calls ring_changed/3.

    case RingID of
        LastID ->
            maybe_ensure_vnodes_started(Ring),
            State;
        _ ->
            ensure_vnodes_started(Ring),
            State2 = ring_changed(Ring, CHBin, State),
            State2#state{last_ring_id=RingID}
    end

    TODO: find out why this alone can determine whether or not the ring has changed.

    (03/03 postscript): RingID is not an ID unique to the entire ring; it is an id that each node keeps for its own vnodes, and it is incremented whenever there is a change. Therefore, if RingID and LastID differ, it means a change has occurred.

  4. riak_core_vnode_manager:ring_changed/3

    .. code-block:: erlang

        State2 = update_forwarding(AllVNodes, Mods, Ring, State),

        %% update handoff state
        State3 = update_handoff(AllVNodes, Ring, CHBin, State2),

        %% trigger ownership transfers
        Transfers = riak_core_ring:pending_changes(Ring),
        trigger_ownership_handoff(Transfers, Mods, Ring, State3),

    It updates the forwarding state and the handoff state, then calls trigger_ownership_handoff/4. The state here is whether each vnode should forward or hand off.

  5. riak_core_vnode_manager:trigger_ownership_handoff/4

    .. code-block:: erlang

        Throttle = limit_ownership_handoff(Transfers, IsResizing),
        Awaiting = [{Mod, Idx} || {Idx, Node, _, CMods, S} <- Throttle,
                                  Mod <- Mods,
                                  S =:= awaiting,
                                  Node =:= node(),
                                  not lists:member(Mod, CMods)],
        [maybe_trigger_handoff(Mod, Idx, State) || {Mod, Idx} <- Awaiting],

  6. riak_core_vnode_manager:maybe_trigger_handoff/3

    Calls riak_core_vnode:trigger_handoff/3.

    .. code-block:: erlang

        case dict:find({Mod, Idx}, HO) of
            {ok, '$resize'} ->
                {ok, Ring} = riak_core_ring_manager:get_my_ring(),
                case riak_core_ring:awaiting_resize_transfer(Ring,
                                                             {Idx, node()},
                                                             Mod) of
                    undefined ->
                        ok;
                    {TargetIdx, TargetNode} ->
                        riak_core_vnode:trigger_handoff(Pid, TargetIdx,
                                                        TargetNode)
                end;
            {ok, '$delete'} ->
                riak_core_vnode:trigger_delete(Pid);
            {ok, TargetNode} ->
                riak_core_vnode:trigger_handoff(Pid, TargetNode),
                ok;

  7. trigger_handoff/3

    Sends the trigger_handoff event.

    .. code-block:: erlang

        trigger_handoff(VNode, TargetIdx, TargetNode) ->
            gen_fsm:send_all_state_event(VNode,
                                         {trigger_handoff, TargetIdx, TargetNode}).

  8. riak_core_vnode:active({trigger_handoff, TargetIdx, TargetNode}, State)

    Calls maybe_handoff/3.

    .. code-block:: erlang

        active({trigger_handoff, TargetIdx, TargetNode}, State) ->
            maybe_handoff(TargetIdx, TargetNode, State);

  9. riak_core_vnode:maybe_handoff/3

    HOType is determined by judging from Resizing and Primary whether this is a resize_transfer or a hinted_handoff. start_handoff/4 is then called with that HOType as an argument.
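My reading of that decision, as a simplified sketch (not riak's actual code). The text above mentions only resize_transfer and hinted_handoff; the ownership_transfer branch for primary targets is my assumption.

```python
def handoff_type(resizing: bool, primary: bool) -> str:
    """Simplified sketch of deriving HOType from the Resizing/Primary flags."""
    if resizing:
        return "resize_transfer"
    if primary:
        return "ownership_transfer"  # assumed branch, not stated in the text above
    return "hinted_handoff"
```

So a non-primary, non-resizing target falls through to a hinted handoff, the case where a fallback vnode returns data to its proper owner.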

  10. riak_core_vnode:start_handoff/4

    Calls start_outbound/4.

  11. riak_core_vnode:start_outbound/4

    Calls riak_core_handoff_manager:add_outbound/7.

    .. code-block:: erlang

        case riak_core_handoff_manager:add_outbound(HOType, Mod, Idx, TargetIdx,
                                                    TargetNode, self(), Opts) of
            {ok, Pid} ->
                State#state{handoff_pid=Pid,
                            handoff_type=HOType,
                            handoff_target={TargetIdx, TargetNode}};

  12. riak_core_handoff_manager:add_outbound/7

    Calls send_handoff/8.

    .. code-block:: erlang

        case send_handoff(Type, {Mod, SrcIdx, TargetIdx}, Node, Pid, HS, Opts) of
            {ok, Handoff=#handoff_status{transport_pid=Sender}} ->
                HS2 = HS ++ [Handoff],
                {reply, {ok, Sender}, State#state{handoffs=HS2}};

  13. riak_core_handoff_manager:send_handoff/8

    This is also called from riak_core_handoff_manager:xfer/3 and from handle_cast({send_handoff, ...}).

    Inside, it first checks whether another handoff is already running.

    Then, once it knows the handoff can actually run, it sets up a monitor and launches the sender process; concretely, it calls riak_core_handoff_sender_sup:start_sender. At this point the filter and related options are switched according to the handoff type (HOType).

    .. code-block:: erlang

        case HOType of
            repair ->
                HOFilter = Filter,
                HOAcc0 = undefined,
                HONotSentFun = undefined;
            resize_transfer ->
                {ok, Ring} = riak_core_ring_manager:get_my_ring(),
                HOFilter = resize_transfer_filter(Ring, Mod, Src, Target),
                HOAcc0 = ordsets:new(),
                HONotSentFun = resize_transfer_notsent_fun(Ring, Mod, Src);
            _ ->
                HOFilter = none,
                HOAcc0 = undefined,
                HONotSentFun = undefined
        end,
        HOOpts = [{filter, HOFilter},
                  {notsent_acc0, HOAcc0},
                  {notsent_fun, HONotSentFun} | BaseOpts],
        {ok, Pid} = riak_core_handoff_sender_sup:start_sender(HOType,
                                                              Mod,
                                                              Node,
                                                              Vnode,
                                                              HOOpts),
        PidM = monitor(process, Pid),

  14. riak_core_handoff_sender_sup:start_sender

    Calls start_fold/5 via start_link.

  15. riak_core_handoff_sender:start_fold/5

    First, maybe_call_handoff_started/3 performs a last check that the handoff may proceed.

    Next, it actually connects to the handoff destination.

    .. code-block:: erlang

        true ->
            {ok, Skt} = gen_tcp:connect(TNHandoffIP, Port, SockOpts, 15000),
            {Skt, gen_tcp}
        end,

    It then sends a message:

    .. code-block:: erlang

        ModBin = atom_to_binary(Module, utf8),
        Msg = <<?PT_MSG_OLDSYNC:8, ModBin/binary>>,
        ok = TcpMod:send(Socket, Msg),

    It receives a sync message from the transfer destination. Receiving this means the destination has not refused the handoff.

    .. code-block:: erlang

        case TcpMod:recv(Socket, 0, RecvTimeout) of
            {ok, [?PT_MSG_OLDSYNC|<<"sync">>]} -> ok;
            {error, timeout} -> exit({shutdown, timeout});
            {error, closed} -> exit({shutdown, max_concurrency})
        end,

    It checks whether the receiver supports batched messages.

    .. code-block:: erlang

        RemoteSupportsBatching = remote_supports_batching(TargetNode),

    It sends the partition ID:

    .. code-block:: erlang

        M = <<?PT_MSG_INIT:8, TargetPartition:160/integer>>,
        ok = TcpMod:send(Socket, M),
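The wire frame above, <<?PT_MSG_INIT:8, TargetPartition:160/integer>>, is a 1-byte message tag followed by the partition index as a 160-bit (20-byte) big-endian integer. A minimal language-neutral sketch of that framing (the tag value 0 is an assumption for illustration, not taken from the riak_core headers):

```python
PT_MSG_INIT = 0  # hypothetical tag value, for illustration only

def encode_init(partition: int) -> bytes:
    # 1-byte message tag, then the partition index as a 160-bit big-endian integer
    return bytes([PT_MSG_INIT]) + partition.to_bytes(20, "big")

def decode_init(frame: bytes) -> int:
    assert frame[0] == PT_MSG_INIT
    return int.from_bytes(frame[1:21], "big")
```

160 bits matches the SHA-1-sized partition indexes riak's ring uses, so every frame is a fixed 21 bytes.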

    It creates the fold request. The function visit_item/3 is passed here, and it is this visit_item/3 that actually sends the data.

    .. code-block:: erlang

        Req = riak_core_util:make_fold_req(
                fun visit_item/3,
                #ho_acc{ack=0,

    It calls sync_command. The comment here is also worrying...

    .. code-block:: erlang

        %% IFF the vnode is using an async worker to perform the fold
        %% then sync_command will return error on vnode crash,
        %% otherwise it will wait forever but vnode crash will be
        %% caught by handoff manager. I know, this is confusing, a
        %% new handoff system will be written soon enough.
        AccRecord0 = riak_core_vnode_master:sync_command({SrcPartition, SrcNode},
                                                         Req, VMaster, infinity),

    Finally, if anything is still in the buffer, it calls send_objects/2.

    .. code-block:: erlang

        %% send any straggler entries remaining in the buffer
        AccRecord = send_objects(AccRecord0#ho_acc.item_queue, AccRecord0),

    If there is no error, it sends a sync message and confirms that one comes back. When it does, the transfer has succeeded.

    .. code-block:: erlang

        lager:debug("~p ~p Sending final sync",
                    [SrcPartition, Module]),
        ok = TcpMod:send(Socket, <<?PT_MSG_SYNC:8>>),

        case TcpMod:recv(Socket, 0, RecvTimeout) of
            {ok, [?PT_MSG_SYNC|<<"sync">>]} ->
                lager:debug("~p ~p Final sync received",
                            [SrcPartition, Module]);
            {error, timeout} -> exit({shutdown, timeout})
        end,

    Finally, a resize_transfer_complete or handoff_complete event is fired depending on the type, and the handoff is over.

    .. code-block:: erlang

        case Type of
            repair -> ok;
            resize_transfer ->
                gen_fsm:send_event(ParentPid,
                                   {resize_transfer_complete, NotSentAcc});
            _ ->
                gen_fsm:send_event(ParentPid, handoff_complete)
        end;

  16. riak_core_handoff_sender:visit_item/3

    visit_item/3 has both a synchronous send path and an asynchronous one.

    In the sync case, it sends items one at a time like this, doing a recv for each:

    .. code-block:: erlang

        case TcpMod:send(Sock, M) of
            ok ->
                case TcpMod:recv(Sock, 0, RecvTimeout) of
                    {ok, [?PT_MSG_OLDSYNC|<<"sync">>]} ->
                        Acc2 = Acc#ho_acc{ack=0, error=ok, stats=Stats3},
                        visit_item(K, V, Acc2);

    Otherwise, the item passes through the filter, it is decided whether to batch it, and then it is actually sent.

    .. code-block:: erlang

        case Module:encode_handoff_item(K, V) of
            ...
                Acc;
            ...
                case ItemQueueByteSize2 =< HandoffBatchThreshold of
                    true  -> Acc2#ho_acc{item_queue=ItemQueue2};
                    false -> send_objects(ItemQueue2, Acc2)
                end;

    When UseBatching is used, it checks whether the queue is within HandoffBatchThreshold; if it is, the item is added to the batch, and if the threshold is exceeded, the batch is sent with send_objects/2.
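The batching rule above can be sketched in a few lines (a minimal model mirroring the Erlang, assuming the threshold is compared against the queued byte size): items accumulate until adding one pushes the queue past the threshold, at which point the whole batch is flushed.

```python
def visit(item, queue, queue_bytes, threshold, sent):
    """Queue item; flush the whole batch once the byte size exceeds threshold."""
    queue = queue + [item]
    queue_bytes += len(item)
    if queue_bytes <= threshold:
        return queue, queue_bytes       # still under threshold: keep batching
    sent.append(queue)                  # stands in for send_objects/2
    return [], 0                        # batch flushed, start a fresh queue
```

With a 10-byte threshold, three 4-byte items accumulate until the third pushes the size to 12, flushing all three in one batch.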

  17. riak_core_handoff_sender:send_objects/2

    This is the function that actually sends the data. It looks like this:

    .. code-block:: erlang

        M = <<?PT_MSG_BATCH:8, ObjectList/binary>>,

        NumBytes = byte_size(M),

        Stats2 = incr_bytes(incr_objs(Stats, NObjects), NumBytes),
        Stats3 = maybe_send_status({Module, SrcPartition, TargetPartition}, Stats2),

        case TcpMod:send(Sock, M) of

On the receiving side

  1. riak_core_handoff_listener:new_connection/2

    riak_core_handoff_listener has gen_nb_server as its behaviour.

    When a connection is accepted, it calls riak_core_handoff_manager:add_inbound/1.

    .. code-block:: erlang

        new_connection(Socket, State = #state{ssl_opts=SslOpts}) ->
            case riak_core_handoff_manager:add_inbound(SslOpts) of
                {ok, Pid} ->
                    gen_tcp:controlling_process(Socket, Pid),
                    ok = riak_core_handoff_receiver:set_socket(Pid, Socket),
                    {ok, State};
                {error, _Reason} ->
                    riak_core_stat:update(rejected_handoffs),
                    gen_tcp:close(Socket),
                    {ok, State}
            end.

  2. riak_core_handoff_manager:add_inbound/1

    Calls receive_handoff/1.

    .. code-block:: erlang

        case receive_handoff(SSLOpts) of
            {ok, Handoff=#handoff_status{transport_pid=Receiver}} ->
                HS2 = HS ++ [Handoff],
                {reply, {ok, Receiver}, State#state{handoffs=HS2}};
            Error ->
                {reply, Error, State}
        end;

  3. riak_core_handoff_manager:receive_handoff/1

    It calls handoff_concurrency_limit_reached/0 to check the number of concurrent handoffs, then calls riak_core_handoff_receiver_sup:start_receiver to launch the receiver process.

    .. code-block:: erlang

        case handoff_concurrency_limit_reached() of
            true ->
                {error, max_concurrency};
            false ->
                {ok, Pid} = riak_core_handoff_receiver_sup:start_receiver(SSLOpts),
                PidM = monitor(process, Pid),

                %% successfully started up a new receiver
                {ok, #handoff_status{transport_pid=Pid,

  4. riak_core_handoff_receiver does the processing according to the message type.

    INIT

    .. code-block:: erlang

        process_message(?PT_MSG_INIT, MsgData,
                        State = #state{vnode_mod=VNodeMod, peer=Peer}) ->
            <<Partition:160/integer>> = MsgData,
            lager:info("Receiving handoff data for partition ~p:~p from ~p",
                       [VNodeMod, Partition, Peer]),
            {ok, VNode} = riak_core_vnode_master:get_vnode_pid(Partition, VNodeMod),
            Data = [{mod_src_tgt, {VNodeMod, undefined, Partition}},
                    {vnode_pid, VNode}],
            riak_core_handoff_manager:set_recv_data(self(), Data),
            State#state{partition=Partition, vnode=VNode};

    SYNC

    .. code-block:: erlang

        process_message(?PT_MSG_SYNC, _MsgData,
                        State = #state{sock=Socket, tcp_mod=TcpMod}) ->
            TcpMod:send(Socket, <<?PT_MSG_SYNC:8, "sync">>),
            State;

    BATCH. In fact it is treated as a collection of PT_MSG_OBJ messages.

    .. code-block:: erlang

        process_message(?PT_MSG_BATCH, MsgData, State) ->
            lists:foldl(fun(Obj, StateAcc) ->
                            process_message(?PT_MSG_OBJ, Obj, StateAcc)
                        end,
                        State, binary_to_term(MsgData));
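The BATCH handling above can be pictured as: one framed message whose body is a list of objects, each of which is then processed as if it had arrived as its own PT_MSG_OBJ message. A language-neutral sketch of that fold:

```python
def process_batch(objs, state, process_obj):
    """Like lists:foldl/3: thread the state through per-object processing."""
    for obj in objs:
        state = process_obj(obj, state)
    return state

# A batch of three objects, with a per-object handler that counts them.
count = process_batch([b"a", b"b", b"c"], 0, lambda obj, n: n + 1)
```

This is why batching changes the wire framing but not the per-object logic: the receiver reuses the single-object path for every batch entry.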

    OBJ. The data received here is sent to the vnode's gen_fsm as an event (Obj). ...

    .. code-block:: erlang

        case gen_fsm:sync_send_all_state_event(VNode, Msg, 60000) of
            ok ->
                State#state{count=Count+1};
            E={error, _} ->
                exit(E)
        end;

Schematic flow

TBD

riak_core_vnode_proxy

Tip

3/3 postscript

vnode_proxy is the path through which every request to a vnode passes. It seems this proxy can, for example, respond on the vnode's behalf when the vnode cannot. (I have not chased the actual code, though.)

In riak_core_ring_handler.erl, maybe_start_vnode_proxies (Ring) is called.

handle_event({ring_update, Ring}, State) ->
    maybe_start_vnode_proxies(Ring),
    maybe_stop_vnode_proxies(Ring),
    {ok, State}.

In maybe_start_vnode_proxies, if the future ring is larger than the current one, proxies are started. However, they are also stopped right after that, which I do not quite understand yet.

riak_core_vnode_worker

Tip

3/3 postscript

vnode_worker is what makes asynchronous requests possible: when a vnode performs a time-consuming operation, it starts this worker and lets the worker do the work. (I have not chased the actual code, though.)
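The pattern described above is generic, so here is a minimal sketch of it in a neutral language (not riak's code): a long-running job is handed to a worker so the main loop stays free to service other requests, and the result is collected later.

```python
from concurrent.futures import ThreadPoolExecutor

def slow_fold(data):
    # stands in for a time-consuming fold over the vnode's data
    return sum(data)

with ThreadPoolExecutor(max_workers=1) as pool:
    # hand the heavy work to the worker instead of blocking the main loop
    future = pool.submit(slow_fold, range(1000))
    # ... the main loop could keep servicing other requests here ...
    result = future.result()  # collect the reply once the worker finishes
```

The design point is the same as vnode_worker's: the vnode process itself never blocks on the fold, only on collecting the finished result.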