deep_scrub_choked

This commit is contained in:
a.pivkin 2026-04-06 07:08:17 +03:00
commit 9149dc1d7f
840 changed files with 52471 additions and 0 deletions

18
.deepsource.toml Normal file

@ -0,0 +1,18 @@
version = 1
test_patterns = ["tests/**"]
exclude_patterns = [
"roles/**",
"profiles/**",
"infrastructure-playbooks/**",
"group_vars/**",
"contrib/**"
]
[[analyzers]]
name = "python"
enabled = true
[analyzers.meta]
runtime_version = "3.x.x"

63
.mergify.yml Normal file

@ -0,0 +1,63 @@
pull_request_rules:
# Backports
- actions:
backport:
branches:
- stable-3.0
conditions:
- label=backport-stable-3.0
name: backport stable-3.0
- actions:
backport:
branches:
- stable-3.1
conditions:
- label=backport-stable-3.1
name: backport stable-3.1
- actions:
backport:
branches:
- stable-3.2
conditions:
- label=backport-stable-3.2
name: backport stable-3.2
- actions:
backport:
branches:
- stable-4.0
conditions:
- label=backport-stable-4.0
name: backport stable-4.0
- actions:
backport:
branches:
- stable-5.0
conditions:
- label=backport-stable-5.0
name: backport stable-5.0
- actions:
backport:
branches:
- stable-6.0
conditions:
- label=backport-stable-6.0
name: backport stable-6.0
- actions:
backport:
branches:
- stable-7.0
conditions:
- label=backport-stable-7.0
name: backport stable-7.0
- actions:
backport:
branches:
- stable-8.0
conditions:
- label=backport-stable-8.0
name: backport stable-8.0
commands_restrictions:
backport:
conditions:
- base=main
- number<0

10
.readthedocs.yaml Normal file

@ -0,0 +1,10 @@
version: 2
build:
os: "ubuntu-22.04"
tools:
python: "3.9"
sphinx:
# Path to your Sphinx configuration file.
configuration: docs/source/conf.py

101
CONTRIBUTING.md Normal file

@ -0,0 +1,101 @@
# Contributing to ceph-ansible
1. Follow the [commit guidelines](#commit-guidelines)
## Commit guidelines
- All commits should have a subject and a body
- The commit subject should briefly describe what the commit changes
- The commit body should describe the problem addressed and the chosen solution
- What was the problem and solution? Why that solution? Were there alternative ideas?
- Wrap commit subjects and bodies to 80 characters
- Sign-off your commits
- Add a best-effort scope designation to commit subjects. This could be a directory name, file name,
or the name of a logical grouping of code. Examples:
- library: add a placeholder module for the validate action plugin
- site.yml: combine validate play with fact gathering play
- Commits linked to an issue should reference it with:
- Fixes: #2653
[Suggested reading.](https://chris.beams.io/posts/git-commit/)
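Putting these guidelines together, a complete commit message might look like the following sketch (scope, subject, body, and sign-off identity are all illustrative; the issue number reuses the example above):

```
docs: clarify the backport workflow in CONTRIBUTING.md

The backport section did not explain who sets the backport label or
what happens when the cherry-pick conflicts. Describe both cases so
contributors know what to expect once their PR is merged.

Fixes: #2653

Signed-off-by: Jane Doe <jane@example.com>
```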
## Pull requests
### Jenkins CI
We use Jenkins to run several tests on each pull request.
If you don't want to run a build for a particular pull request, because all you are changing is the
README for example, add the text `[skip ci]` to the PR title.
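For instance, a documentation-only change could use a PR title like this hypothetical one:

```
README.rst: fix broken link to the hosted documentation [skip ci]
```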
### Merging strategy
Merging PRs is controlled by [mergify](https://mergify.io/) according to the following rules:
- at least one approval from a maintainer
- a SUCCESS from the CI pipeline "ceph-ansible PR Pipeline"
If your work is not ready for review/merge, please request the DNM label via a comment or in the title of your PR.
This will prevent the engine from merging your pull request.
### Backports (maintainers only)
If you wish to see your work from 'main' backported to a stable branch, you can ping a maintainer
so they will set the backport label on your PR. Once the PR from main is merged, a backport PR will be created by mergify;
if there is a cherry-pick conflict you must resolve it by pulling the branch.
**NEVER** push directly into a stable branch, **unless** the code from main has diverged so much that the files don't exist in the stable branch.
If that happens, inform the maintainers of the reasons why you pushed directly into a stable branch; if the reason is invalid, maintainers will immediately close your pull request.
## Good to know
### Sample files
The sample files we provide in `group_vars/` are versioned;
they are a copy of what their respective `./roles/<role>/defaults/main.yml` files contain.
This means that if you push a patch modifying one of these files:
- `./roles/ceph-mds/defaults/main.yml`
- `./roles/ceph-mgr/defaults/main.yml`
- `./roles/ceph-fetch-keys/defaults/main.yml`
- `./roles/ceph-rbd-mirror/defaults/main.yml`
- `./roles/ceph-defaults/defaults/main.yml`
- `./roles/ceph-osd/defaults/main.yml`
- `./roles/ceph-nfs/defaults/main.yml`
- `./roles/ceph-client/defaults/main.yml`
- `./roles/ceph-common/defaults/main.yml`
- `./roles/ceph-mon/defaults/main.yml`
- `./roles/ceph-rgw/defaults/main.yml`
- `./roles/ceph-container-common/defaults/main.yml`
- `./roles/ceph-common-coreos/defaults/main.yml`
You will have to update the corresponding sample file as well; there is a script that does it for you.
You must run `./generate_group_vars_sample.sh` before you commit your changes so you are guaranteed to have consistent content for these files.
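A minimal sketch of that workflow, assuming you changed the OSD role defaults (the exact sample file name comes from the script; `group_vars/osds.yml.sample` is shown here as an assumption):

```shell
# after editing roles/ceph-osd/defaults/main.yml, regenerate the samples
./generate_group_vars_sample.sh

# commit the defaults and the regenerated sample together
git add roles/ceph-osd/defaults/main.yml group_vars/osds.yml.sample
git commit -s
```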
### Keep your branch up-to-date
Sometimes a pull request is subject to long discussions, reviews, and comments; meanwhile, `main`
moves forward, so try to keep your branch rebased on main regularly to avoid huge merge conflicts.
A regularly rebased branch is easier and quicker to merge.
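A typical rebase sequence might look like this sketch (`origin` pointing at ceph/ceph-ansible, and `myfork`/`my-feature` being your own remote and branch, are assumptions):

```shell
git fetch origin
git checkout my-feature
git rebase origin/main
# if the rebase stops on conflicts: fix them, "git add" the files,
# then run "git rebase --continue"
# the rewritten history must be force-pushed to your fork
git push --force-with-lease myfork my-feature
```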
### Organize your commits
Do not split your commits unnecessarily; we often see pull requests with useless additional commits like
"I'm addressing reviewer's comments". Please squash and/or amend them as much as possible.
Similarly, split commits when needed: if you are modifying several parts of ceph-ansible or pushing a large
patch, you may have to split your commits properly so your work is easier to understand.
Some recommendations:
- one fix = one commit,
- do not mix multiple topics in a single commit,
- if your PR contains a large number of commits that are totally unrelated to each other, it should probably be split into several PRs.
If you've broken your work up into a set of sequential changes and each commit passes the tests on its own, that's fine.
If you've got commits fixing typos or other problems introduced by previous commits in the same PR, those should be squashed before merging.
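For example, to squash the last three commits of your branch into one before the PR is merged (adjust the count to your own history, then force-push the rewritten branch as in the rebase sketch above):

```shell
# squash the last three commits into one: in the editor that opens,
# mark the follow-up commits as "squash" or "fixup"
git rebase -i HEAD~3
```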
If you are new to Git, these links might help:
- [https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History](https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History)
- [http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html)

201
LICENSE Normal file

@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [2014] [Sébastien Han]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

113
Makefile Normal file

@ -0,0 +1,113 @@
# Makefile for constructing RPMs.
# Try "make" (for SRPMS) or "make rpm"
NAME = ceph-ansible
# Set the RPM package NVR from "git describe".
# Examples:
#
# A "git describe" value of "v2.2.0beta1" would create an NVR
# "ceph-ansible-2.2.0-0.beta1.1.el8"
#
# A "git describe" value of "v2.2.0rc1" would create an NVR
# "ceph-ansible-2.2.0-0.rc1.1.el8"
#
# A "git describe" value of "v2.2.0rc1-1-gc465f85" would create an NVR
# "ceph-ansible-2.2.0-0.rc1.1.gc465f85.el8"
#
# A "git describe" value of "v2.2.0" creates an NVR
# "ceph-ansible-2.2.0-1.el8"
DIST ?= "el8"
MOCK_CONFIG ?= "centos-stream+epel-8-x86_64"
TAG := $(shell git describe --tags --abbrev=0 --match 'v*')
VERSION := $(shell echo $(TAG) | sed 's/^v//')
COMMIT := $(shell git rev-parse HEAD)
SHORTCOMMIT := $(shell echo $(COMMIT) | cut -c1-7)
RELEASE := $(shell git describe --tags --match 'v*' \
| sed 's/^v//' \
| sed 's/^[^-]*-//' \
| sed 's/-/./')
ifeq ($(VERSION),$(RELEASE))
RELEASE = 1
endif
ifneq (,$(findstring alpha,$(VERSION)))
ALPHA := $(shell echo $(VERSION) | sed 's/.*alpha/alpha/')
RELEASE := 0.$(ALPHA).$(RELEASE)
VERSION := $(subst $(ALPHA),,$(VERSION))
endif
ifneq (,$(findstring beta,$(VERSION)))
BETA := $(shell echo $(VERSION) | sed 's/.*beta/beta/')
RELEASE := 0.$(BETA).$(RELEASE)
VERSION := $(subst $(BETA),,$(VERSION))
endif
ifneq (,$(findstring rc,$(VERSION)))
RC := $(shell echo $(VERSION) | sed 's/.*rc/rc/')
RELEASE := 0.$(RC).$(RELEASE)
VERSION := $(subst $(RC),,$(VERSION))
endif
ifneq (,$(shell echo $(VERSION) | grep [a-zA-Z]))
# If we still have alpha characters in our Git tag string, we don't know
# how to translate that into a sane RPM version/release. Bail out.
$(error cannot translate Git tag version $(VERSION) to an RPM NVR)
endif
NVR := $(NAME)-$(VERSION)-$(RELEASE).$(DIST)
all: srpm
# Testing only
echo:
echo COMMIT $(COMMIT)
echo VERSION $(VERSION)
echo RELEASE $(RELEASE)
echo NVR $(NVR)
clean:
rm -rf dist/
rm -rf ceph-ansible-$(VERSION)-$(SHORTCOMMIT).tar.gz
rm -rf $(NVR).src.rpm
dist:
git archive --format=tar.gz --prefix=ceph-ansible-$(VERSION)/ HEAD > ceph-ansible-$(VERSION)-$(SHORTCOMMIT).tar.gz
spec:
sed ceph-ansible.spec.in \
-e 's/@COMMIT@/$(COMMIT)/' \
-e 's/@VERSION@/$(VERSION)/' \
-e 's/@RELEASE@/$(RELEASE)/' \
> ceph-ansible.spec
srpm: dist spec
rpmbuild -bs ceph-ansible.spec \
--define "_topdir ." \
--define "_sourcedir ." \
--define "_srcrpmdir ." \
--define "dist .$(DIST)"
rpm: dist srpm
mock -r $(MOCK_CONFIG) rebuild $(NVR).src.rpm \
--resultdir=. \
--define "dist .$(DIST)"
tag:
$(eval BRANCH := $(shell git rev-parse --abbrev-ref HEAD))
$(eval LASTNUM := $(shell echo $(TAG) \
| sed -E "s/.*[^0-9]([0-9]+)$$/\1/"))
$(eval NEXTNUM=$(shell echo $$(($(LASTNUM)+1))))
$(eval NEXTTAG=$(shell echo $(TAG) | sed "s/$(LASTNUM)$$/$(NEXTNUM)/"))
if [[ "$(TAG)" == "$(git describe --tags --match 'v*')" ]]; then \
echo "$(SHORTCOMMIT) on $(BRANCH) is already tagged as $(TAG)"; \
exit 1; \
fi
if [[ "$(BRANCH)" != "master" || "$(BRANCH)" != "main" ]] && \
! [[ "$(BRANCH)" =~ ^stable- ]]; then \
echo Cannot tag $(BRANCH); \
exit 1; \
fi
@echo Tagging Git branch $(BRANCH)
git tag $(NEXTTAG)
@echo run \'git push origin $(NEXTTAG)\' to push to GitHub.
.PHONY: dist rpm srpm tag

10
README.rst Normal file

@ -0,0 +1,10 @@
Ceph Ansible
==============
The project is still maintained for the time being, but users are encouraged to migrate to `cephadm <https://docs.ceph.com/en/latest/cephadm/>`_.
Ansible playbooks for Ceph, the distributed object, block, and file storage platform.
Please refer to our hosted documentation here: https://docs.ceph.com/projects/ceph-ansible/en/latest/
You can view documentation for our ``stable-*`` branches by substituting ``main`` in the link
above with the name of the branch. For example: https://docs.ceph.com/projects/ceph-ansible/en/stable-8.0/

605
Vagrantfile vendored Normal file

@ -0,0 +1,605 @@
# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'yaml'
require 'resolv'
VAGRANTFILE_API_VERSION = '2'
if File.file?(File.join(File.dirname(__FILE__), 'vagrant_variables.yml')) then
vagrant_variables_file = 'vagrant_variables.yml'
else
vagrant_variables_file = 'vagrant_variables.yml.sample'
end
config_file=File.expand_path(File.join(File.dirname(__FILE__), vagrant_variables_file))
settings=YAML.load_file(config_file)
LABEL_PREFIX = settings['label_prefix'] ? settings['label_prefix'] + "-" : ""
NMONS = settings['mon_vms']
NOSDS = settings['osd_vms']
NMDSS = settings['mds_vms']
NRGWS = settings['rgw_vms']
NNFSS = settings['nfs_vms']
NRBD_MIRRORS = settings['rbd_mirror_vms']
CLIENTS = settings['client_vms']
MGRS = settings['mgr_vms']
PUBLIC_SUBNET = settings['public_subnet']
CLUSTER_SUBNET = settings['cluster_subnet']
BOX = ENV['CEPH_ANSIBLE_VAGRANT_BOX'] || settings['vagrant_box']
CLIENT_BOX = ENV['CEPH_ANSIBLE_VAGRANT_BOX'] || settings['client_vagrant_box'] || BOX
BOX_URL = ENV['CEPH_ANSIBLE_VAGRANT_BOX_URL'] || settings['vagrant_box_url']
SYNC_DIR = settings['vagrant_sync_dir']
MEMORY = settings['memory']
ETH = settings['eth']
DOCKER = settings['docker']
USER = settings['ssh_username']
DEBUG = settings['debug']
ASSIGN_STATIC_IP = !(BOX == 'openstack' or BOX == 'linode')
DISABLE_SYNCED_FOLDER = settings.fetch('vagrant_disable_synced_folder', false)
"#{PUBLIC_SUBNET}" =~ Resolv::IPv6::Regex ? IPV6 = true : IPV6 = false
$last_ip_pub_digit = 9
$last_ip_cluster_digit = 9
ansible_provision = proc do |ansible|
if DOCKER then
ansible.playbook = 'site-container.yml'
if settings['skip_tags']
ansible.skip_tags = settings['skip_tags']
end
else
ansible.playbook = 'site.yml'
end
# Note: Can't do ranges like mon[0-2] in groups because
# these aren't supported by Vagrant, see
# https://github.com/mitchellh/vagrant/issues/3539
ansible.groups = {
'mons' => (0..NMONS - 1).map { |j| "#{LABEL_PREFIX}mon#{j}" },
'osds' => (0..NOSDS - 1).map { |j| "#{LABEL_PREFIX}osd#{j}" },
'mdss' => (0..NMDSS - 1).map { |j| "#{LABEL_PREFIX}mds#{j}" },
'rgws' => (0..NRGWS - 1).map { |j| "#{LABEL_PREFIX}rgw#{j}" },
'nfss' => (0..NNFSS - 1).map { |j| "#{LABEL_PREFIX}nfs#{j}" },
'rbd_mirrors' => (0..NRBD_MIRRORS - 1).map { |j| "#{LABEL_PREFIX}rbd_mirror#{j}" },
'clients' => (0..CLIENTS - 1).map { |j| "#{LABEL_PREFIX}client#{j}" },
'mgrs' => (0..MGRS - 1).map { |j| "#{LABEL_PREFIX}mgr#{j}" },
}
if IPV6 then
ansible.extra_vars = {
cluster_network: "#{CLUSTER_SUBNET}/64",
journal_size: 100,
public_network: "#{PUBLIC_SUBNET}/64",
}
else
ansible.extra_vars = {
cluster_network: "#{CLUSTER_SUBNET}.0/24",
journal_size: 100,
public_network: "#{PUBLIC_SUBNET}.0/24",
}
end
# In a production deployment, these should be secret
if DOCKER then
ansible.extra_vars = ansible.extra_vars.merge({
containerized_deployment: 'true',
ceph_mon_docker_subnet: ansible.extra_vars[:public_network],
devices: settings['disks'],
radosgw_interface: ETH,
generate_fsid: 'true',
})
else
ansible.extra_vars = ansible.extra_vars.merge({
devices: settings['disks'],
radosgw_interface: ETH,
os_tuning_params: settings['os_tuning_params'],
})
end
if BOX == 'linode' then
ansible.sudo = true
# Use radosgw_address_block instead of radosgw_interface:
ansible.extra_vars.delete(:radosgw_interface)
ansible.extra_vars = ansible.extra_vars.merge({
cluster_network: "#{CLUSTER_SUBNET}.0/16",
devices: ['/dev/sdc'], # hardcode leftover disk
monitor_address_block: "#{PUBLIC_SUBNET}.0/16",
radosgw_address_block: "#{PUBLIC_SUBNET}.0/16",
public_network: "#{PUBLIC_SUBNET}.0/16",
})
end
if DEBUG then
ansible.verbose = '-vvvv'
end
ansible.limit = 'all'
end
def create_vmdk(name, size)
dir = Pathname.new(__FILE__).expand_path.dirname
path = File.join(dir, '.vagrant', name + '.vmdk')
`vmware-vdiskmanager -c -s #{size} -t 0 -a scsi #{path} \
2>&1 > /dev/null` unless File.exist?(path)
end
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.vm.box = BOX
config.vm.box_url = BOX_URL
config.ssh.insert_key = false # workaround for https://github.com/mitchellh/vagrant/issues/5048
config.ssh.private_key_path = settings['ssh_private_key_path']
config.ssh.username = USER
# When using libvirt, avoid errors like:
# "host doesn't support requested feature: CPUID.01H:EDX.ds [bit 21]"
config.vm.provider :libvirt do |lv|
lv.cpu_mode = 'host-passthrough'
lv.disk_driver :cache => 'unsafe'
lv.graphics_type = 'none'
lv.cpus = 2
end
# Faster bootup. Disables mounting the sync folder for libvirt and virtualbox
if DISABLE_SYNCED_FOLDER
config.vm.provider :virtualbox do |v,override|
override.vm.synced_folder '.', SYNC_DIR, disabled: true
end
config.vm.provider :libvirt do |v,override|
override.vm.synced_folder '.', SYNC_DIR, disabled: true
end
end
if BOX == 'openstack'
# OpenStack VMs
config.vm.provider :openstack do |os|
config.vm.synced_folder ".", "/home/#{USER}/vagrant", disabled: true
config.ssh.pty = true
os.openstack_auth_url = settings['os_openstack_auth_url']
os.username = settings['os_username']
os.password = settings['os_password']
os.tenant_name = settings['os_tenant_name']
os.region = settings['os_region']
os.flavor = settings['os_flavor']
os.image = settings['os_image']
os.keypair_name = settings['os_keypair_name']
os.security_groups = ['default']
if settings['os_networks'] then
os.networks = settings['os_networks']
end
if settings['os_floating_ip_pool'] then
os.floating_ip_pool = settings['os_floating_ip_pool']
end
config.vm.provision "shell", inline: "true", upload_path: "/home/#{USER}/vagrant-shell"
end
elsif BOX == 'linode'
config.vm.provider :linode do |provider, override|
provider.token = ENV['LINODE_API_KEY']
provider.distribution = settings['cloud_distribution'] # 'Ubuntu 16.04 LTS'
provider.datacenter = settings['cloud_datacenter']
provider.plan = MEMORY.to_s
provider.private_networking = true
# root install generally takes <1GB
provider.xvda_size = 4*1024
# add some swap as the Linode distros require it
provider.swap_size = 128
end
end
(0..NMONS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}mon#{i}" do |mon|
mon.vm.hostname = "#{LABEL_PREFIX}mon#{i}"
if ASSIGN_STATIC_IP && !IPV6
mon.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
mon.vm.provider :virtualbox do |vb,override|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
mon.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
mon.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
mon.vm.provider "parallels" do |prl|
prl.name = "ceph-mon#{i}"
prl.memory = "#{MEMORY}"
end
mon.vm.provider :linode do |provider|
provider.label = mon.vm.hostname
end
end
end
(0..MGRS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}mgr#{i}" do |mgr|
mgr.vm.hostname = "#{LABEL_PREFIX}mgr#{i}"
if ASSIGN_STATIC_IP && !IPV6
mgr.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
mgr.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
mgr.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
mgr.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
mgr.vm.provider "parallels" do |prl|
prl.name = "ceph-mgr#{i}"
prl.memory = "#{MEMORY}"
end
mgr.vm.provider :linode do |provider|
provider.label = mgr.vm.hostname
end
end
end
(0..CLIENTS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}client#{i}" do |client|
client.vm.box = CLIENT_BOX
client.vm.hostname = "#{LABEL_PREFIX}client#{i}"
if ASSIGN_STATIC_IP && !IPV6
client.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
client.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
client.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
client.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
client.vm.provider "parallels" do |prl|
prl.name = "ceph-client#{i}"
prl.memory = "#{MEMORY}"
end
client.vm.provider :linode do |provider|
provider.label = client.vm.hostname
end
end
end
(0..NRGWS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}rgw#{i}" do |rgw|
rgw.vm.hostname = "#{LABEL_PREFIX}rgw#{i}"
if ASSIGN_STATIC_IP && !IPV6
rgw.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
rgw.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
rgw.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
rgw.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
rgw.vm.provider "parallels" do |prl|
prl.name = "ceph-rgw#{i}"
prl.memory = "#{MEMORY}"
end
rgw.vm.provider :linode do |provider|
provider.label = rgw.vm.hostname
end
end
end
(0..NNFSS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}nfs#{i}" do |nfs|
nfs.vm.hostname = "#{LABEL_PREFIX}nfs#{i}"
if ASSIGN_STATIC_IP && !IPV6
nfs.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
nfs.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
nfs.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
nfs.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
nfs.vm.provider "parallels" do |prl|
prl.name = "ceph-nfs#{i}"
prl.memory = "#{MEMORY}"
end
nfs.vm.provider :linode do |provider|
provider.label = nfs.vm.hostname
end
end
end
(0..NMDSS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}mds#{i}" do |mds|
mds.vm.hostname = "#{LABEL_PREFIX}mds#{i}"
if ASSIGN_STATIC_IP && !IPV6
mds.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
mds.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
mds.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
mds.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
mds.vm.provider "parallels" do |prl|
prl.name = "ceph-mds#{i}"
prl.memory = "#{MEMORY}"
end
mds.vm.provider :linode do |provider|
provider.label = mds.vm.hostname
end
end
end
(0..NRBD_MIRRORS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}rbd-mirror#{i}" do |rbd_mirror|
rbd_mirror.vm.hostname = "#{LABEL_PREFIX}rbd-mirror#{i}"
if ASSIGN_STATIC_IP && !IPV6
rbd_mirror.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
end
# Virtualbox
rbd_mirror.vm.provider :virtualbox do |vb|
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
rbd_mirror.vm.provider :vmware_fusion do |v|
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
rbd_mirror.vm.provider :libvirt do |lv,override|
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:ip => "#{PUBLIC_SUBNET}#{$last_ip_pub_digit+=1}",
:netmask => "64"
end
end
# Parallels
rbd_mirror.vm.provider "parallels" do |prl|
prl.name = "ceph-rbd-mirror#{i}"
prl.memory = "#{MEMORY}"
end
rbd_mirror.vm.provider :linode do |provider|
provider.label = rbd_mirror.vm.hostname
end
end
end
(0..NOSDS - 1).each do |i|
config.vm.define "#{LABEL_PREFIX}osd#{i}" do |osd|
osd.vm.hostname = "#{LABEL_PREFIX}osd#{i}"
if ASSIGN_STATIC_IP && !IPV6
osd.vm.network :private_network,
:ip => "#{PUBLIC_SUBNET}.#{$last_ip_pub_digit+=1}"
osd.vm.network :private_network,
:ip => "#{CLUSTER_SUBNET}.#{$last_ip_cluster_digit+=1}"
end
# Virtualbox
osd.vm.provider :virtualbox do |vb|
# Create our own controller for consistency and to remove VM dependency
unless File.exist?("disk-#{i}-0.vdi")
# Adding OSD Controller;
# once the first disk is there assuming we don't need to do this
vb.customize ['storagectl', :id,
'--name', 'OSD Controller',
'--add', 'scsi']
end
(0..2).each do |d|
unless File.exist?("disk-#{i}-#{d}.vdi")
vb.customize ['createhd',
'--filename', "disk-#{i}-#{d}",
'--size', '11000']
end
vb.customize ['storageattach', :id,
'--storagectl', 'OSD Controller',
'--port', 3 + d,
'--device', 0,
'--type', 'hdd',
'--medium', "disk-#{i}-#{d}.vdi"]
end
vb.customize ['modifyvm', :id, '--memory', "#{MEMORY}"]
end
# VMware
osd.vm.provider :vmware_fusion do |v|
(0..1).each do |d|
v.vmx["scsi0:#{d + 1}.present"] = 'TRUE'
v.vmx["scsi0:#{d + 1}.fileName"] =
create_vmdk("disk-#{i}-#{d}", '11000MB')
end
v.vmx['memsize'] = "#{MEMORY}"
end
# Libvirt
driverletters = ('a'..'z').to_a
osd.vm.provider :libvirt do |lv,override|
# always make /dev/sd{a/b/c} so that CI can ensure that
# virtualbox and libvirt will have the same devices to use for OSDs
(0..2).each do |d|
lv.storage :file, :device => "hd#{driverletters[d]}", :size => '50G', :bus => "ide"
end
lv.memory = MEMORY
lv.random_hostname = true
if IPV6 then
override.vm.network :private_network,
:libvirt__ipv6_address => "#{PUBLIC_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-public-network",
:netmask => "64"
override.vm.network :private_network,
:libvirt__ipv6_address => "#{CLUSTER_SUBNET}",
:libvirt__ipv6_prefix => "64",
:libvirt__dhcp_enabled => false,
:libvirt__forward_mode => "veryisolated",
:libvirt__network_name => "ipv6-cluster-network",
:netmask => "64"
end
end
# Parallels
osd.vm.provider "parallels" do |prl|
prl.name = "ceph-osd#{i}"
prl.memory = "#{MEMORY}"
(0..1).each do |d|
prl.customize ["set", :id,
"--device-add",
"hdd",
"--iface",
"sata"]
end
end
osd.vm.provider :linode do |provider|
provider.label = osd.vm.hostname
end
# Run the provisioner after the last machine comes up
osd.vm.provision 'ansible', &ansible_provision if i == (NOSDS - 1)
end
end
end

43
ansible.cfg Normal file

@ -0,0 +1,43 @@
# Comments inside this file must be set BEFORE the option.
# NOT after the option, otherwise the comment will be interpreted as a value to that option.
[defaults]
ansible_managed = Please do not change this file directly since it is managed by Ansible and will be overwritten
library = ./library
module_utils = ./module_utils
action_plugins = plugins/actions
callback_plugins = plugins/callback
filter_plugins = plugins/filter
roles_path = ./roles
# Be sure the user running Ansible has permissions on the logfile
log_path = $HOME/ansible/ansible.log
forks = 20
host_key_checking = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = $HOME/ansible/facts
fact_caching_timeout = 7200
nocows = 1
callback_allowlist = profile_tasks
#stdout_callback = yaml
callback_result_format = yaml
force_valid_group_names = ignore
inject_facts_as_vars = False
# Disable them in the context of https://review.openstack.org/#/c/469644
retry_files_enabled = False
# This is the default SSH timeout to use on connection attempts
# CI slaves are slow so by setting a higher value we can avoid the following error:
# Timeout (12s) waiting for privilege escalation prompt:
timeout = 60
[ssh_connection]
# see: https://github.com/ansible/ansible/issues/11536
control_path = %(directory)s/%%h-%%r-%%p
ssh_args = -o ControlMaster=auto -o ControlPersist=600s
pipelining = True
# Option to retry failed ssh executions if the failure is encountered in ssh itself
retries = 10

62
ceph-ansible.spec.in Normal file

@ -0,0 +1,62 @@
%global commit @COMMIT@
%global shortcommit %(c=%{commit}; echo ${c:0:7})
Name: ceph-ansible
Version: @VERSION@
Release: @RELEASE@%{?dist}
Summary: Ansible playbooks for Ceph
# Some files have been copied from Ansible (GPLv3+). For example:
# plugins/actions/config_template.py
# roles/ceph-common/plugins/actions/config_template.py
License: ASL 2.0 and GPLv3+
URL: https://github.com/ceph/ceph-ansible
Source0: %{name}-%{version}-%{shortcommit}.tar.gz
Obsoletes: ceph-iscsi-ansible <= 1.5
BuildArch: noarch
BuildRequires: ansible-core >= 2.14
Requires: ansible-core >= 2.14
%if 0%{?rhel} == 7
BuildRequires: python2-devel
Requires: python2-netaddr
%else
BuildRequires: python3-devel
Requires: python3-netaddr
%endif
%description
Ansible playbooks for Ceph
%prep
%autosetup -p1
%build
%install
mkdir -p %{buildroot}%{_datarootdir}/ceph-ansible
for f in ansible.cfg *.yml *.sample group_vars roles library module_utils plugins infrastructure-playbooks; do
cp -a $f %{buildroot}%{_datarootdir}/ceph-ansible
done
pushd %{buildroot}%{_datarootdir}/ceph-ansible
# These untested playbooks are too unstable for users.
rm -r infrastructure-playbooks/untested-by-ci
%if ! 0%{?fedora} && ! 0%{?centos}
# remove ability to install ceph community version
rm roles/ceph-common/tasks/installs/redhat_{community,dev}_repository.yml
%endif
popd
%check
# Borrowed from upstream's .travis.yml:
ansible-playbook -i dummy-ansible-hosts test.yml --syntax-check
%files
%doc README.rst
%license LICENSE
%{_datarootdir}/ceph-ansible
%changelog

@ -0,0 +1,96 @@
#!/usr/bin/env bash
set -e
shopt -s extglob # enable extended pattern matching features
#############
# VARIABLES #
#############
stable_branch=$1
commit=$2
bkp_branch_name=$3
bkp_branch_name_prefix=bkp
bkp_branch=$bkp_branch_name-$bkp_branch_name_prefix-$stable_branch
#############
# FUNCTIONS #
#############
verify_commit () {
for com in ${commit//,/ }; do
if [[ $(git cat-file -t "$com" 2>/dev/null) != commit ]]; then
echo "$com does not exist in your tree"
echo "Run 'git fetch origin main && git pull origin main'"
exit 1
fi
done
}
git_status () {
if [[ $(git status --porcelain | wc -l) -gt 0 ]]; then
echo "It looks like you have not committed changes:"
echo ""
git status --short
echo ""
echo ""
echo "Press ENTER to continue or Ctrl+c to break."
read -r
fi
}
checkout () {
git checkout --no-track -b "$bkp_branch" origin/"$stable_branch"
}
cherry_pick () {
local x
for com in ${commit//,/ }; do
x="$x $com"
done
# Trim the first white space and use an array
# Reference: https://github.com/koalaman/shellcheck/wiki/SC2086#exceptions
x=(${x##*( )})
git cherry-pick -x -s "${x[@]}"
}
push () {
git push origin "$bkp_branch"
}
create_pr () {
hub pull-request -h ceph/ceph-ansible:"$bkp_branch" -b "$stable_branch" -F -
}
cleanup () {
echo "Moving back to previous branch"
git checkout -
git branch -D "$bkp_branch"
}
test_args () {
if [ $# -lt 3 ]; then
echo "Please run the script like this: ./contrib/backport_to_stable_branch.sh STABLE_BRANCH_NAME COMMIT_SHA1 BACKPORT_BRANCH_NAME"
echo "We accept multiple commits as soon as they are commas-separated."
echo "e.g: ./contrib/backport_to_stable_branch.sh stable-2.2 6892670d317698771be7e96ce9032bc27d3fd1e5,8756c553cc8c213fc4996fc5202c7b687eb645a3 my-work"
exit 1
fi
}
########
# MAIN #
########
test_args "$@"
git_status
verify_commit
checkout
cherry_pick
push
create_pr <<MSG
${4} Backport of ${3} in $stable_branch
Backport of #${3} in $stable_branch
MSG
cleanup

@ -0,0 +1,105 @@
#!/bin/bash
set -xe
# VARIABLES
BASEDIR=$(dirname "$0")
LOCAL_BRANCH=$(cd $BASEDIR && git rev-parse --abbrev-ref HEAD)
ROLES="ceph-common ceph-mon ceph-osd ceph-mds ceph-rgw ceph-fetch-keys ceph-rbd-mirror ceph-client ceph-container-common ceph-mgr ceph-defaults ceph-config"
# FUNCTIONS
function goto_basedir {
TOP_LEVEL=$(cd $BASEDIR && git rev-parse --show-toplevel)
if [[ "$(pwd)" != "$TOP_LEVEL" ]]; then
pushd "$TOP_LEVEL"
fi
}
function check_existing_remote {
if ! git remote show "$1" &> /dev/null; then
git remote add "$1" git@github.com:/ceph/ansible-"$1".git
fi
}
function pull_origin {
git pull origin main
}
function reset_hard_origin {
# let's bring everything back to normal
git checkout "$LOCAL_BRANCH"
git fetch origin --prune
git fetch --tags
git reset --hard origin/main
}
function check_git_status {
if [[ $(git status --porcelain | wc -l) -gt 0 ]]; then
echo "It looks like the following changes haven't been committed yet"
echo ""
git status --short
echo ""
echo ""
echo "Do you really want to continue?"
echo "Press ENTER to continue or CTRL C to break"
read -r
fi
}
function compare_tags {
# compare local tags (from https://github.com/ceph/ceph-ansible/) with distant tags (from https://github.com/ceph/ansible-ceph-$ROLE)
local tag_local
local tag_remote
for tag_local in $(git tag | grep -oE '^v[2-9].[0-9]*.[0-9]*$' | sort -t. -k 1,1n -k 2,2n -k 3,3n -k 4,4n); do
tags_array+=("$tag_local")
done
for tag_remote in $(git ls-remote --tags "$1" | grep -oE 'v[2-9].[0-9]*.[0-9]*$' | sort -t. -k 1,1n -k 2,2n -k 3,3n -k 4,4n); do
remote_tags_array+=("$tag_remote")
done
for i in "${tags_array[@]}"; do
skip=
for j in "${remote_tags_array[@]}"; do
[[ "$i" == "$j" ]] && { skip=1; break; }
done
[[ -n $skip ]] || tag_to_apply+=("$i")
done
}
# MAIN
goto_basedir
check_git_status
trap reset_hard_origin EXIT
trap reset_hard_origin ERR
pull_origin
for ROLE in $ROLES; do
# For readability we use 2 variables with the same content
# so we always make sure we 'push' to a remote and 'filter' a role
REMOTE=$ROLE
check_existing_remote "$REMOTE"
reset_hard_origin
# First we filter branches by rewriting main with the content of roles/$ROLE
# this gives us a new commit history
for BRANCH in $(git branch --list --remotes "origin/stable-*" "origin/main" "origin/ansible-1.9" | cut -d '/' -f2); do
git checkout -B "$BRANCH" origin/"$BRANCH"
# use || true to avoid exiting in case of 'Found nothing to rewrite'
git filter-branch -f --prune-empty --subdirectory-filter roles/"$ROLE" || true
git push -f "$REMOTE" "$BRANCH"
done
reset_hard_origin
# then we filter tags starting from version 2.0 and push them
compare_tags "$ROLE"
if [[ ${#tag_to_apply[@]} == 0 ]]; then
echo "No new tag to push."
continue
fi
for TAG in "${tag_to_apply[@]}"; do
# use || true to avoid exiting in case of 'Found nothing to rewrite'
git filter-branch -f --prune-empty --subdirectory-filter roles/"$ROLE" "$TAG" || true
git push -f "$REMOTE" "$TAG"
reset_hard_origin
done
done
trap - EXIT ERR
popd &> /dev/null

44
contrib/rundep.sample Normal file

@ -0,0 +1,44 @@
#Package lines can be commented out with '#'
#
#boost-atomic
#boost-chrono
#boost-date-time
#boost-iostreams
#boost-program
#boost-random
#boost-regex
#boost-system
#boost-thread
#bzip2-libs
#cyrus-sasl-lib
#expat
#fcgi
#fuse-libs
#glibc
#keyutils-libs
#leveldb
#libaio
#libatomic_ops
#libattr
#libblkid
#libcap
#libcom_err
#libcurl
#libgcc
#libicu
#libidn
#libnghttp2
#libpsl
#libselinux
#libssh2
#libstdc++
#libunistring
#nss-softokn-freebl
#openldap
#openssl-libs
#pcre
#python-nose
#python-sphinx
#snappy
#systemd-libs
#zlib

27
contrib/rundep_installer.sh Executable file

@ -0,0 +1,27 @@
#!/bin/bash -e
#
# Copyright (C) 2014, 2015 Red Hat <contact@redhat.com>
#
# Author: Daniel Lin <danielin@umich.edu>
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
if test -f /etc/redhat-release ; then
PACKAGE_INSTALLER=yum
elif type apt-get > /dev/null 2>&1 ; then
PACKAGE_INSTALLER=apt-get
else
echo "ERROR: Package Installer could not be determined"
exit 1
fi
while read p; do
if [[ $p =~ ^#.* ]] ; then
continue
fi
$PACKAGE_INSTALLER install $p -y
done < $1

73
contrib/snapshot_vms.sh Normal file

@ -0,0 +1,73 @@
#!/bin/bash
create_snapshots() {
local pattern=$1
for vm in $(sudo virsh list --all | awk "/${pattern}/{print \$2}"); do
sudo virsh shutdown "${vm}"
wait_for_shutoff "${vm}"
sudo virsh snapshot-create "${vm}"
sudo virsh start "${vm}"
done
}
delete_snapshots() {
local pattern=$1
for vm in $(sudo virsh list --all | awk "/${pattern}/{print \$2}"); do
for snapshot in $(sudo virsh snapshot-list "${vm}" --name); do
echo "deleting snapshot ${snapshot} (vm: ${vm})"
sudo virsh snapshot-delete "${vm}" "${snapshot}"
done
done
}
revert_snapshots() {
local pattern=$1
for vm in $(sudo virsh list --all | awk "/${pattern}/{print \$2}"); do
echo "restoring last snapshot for ${vm}"
sudo virsh snapshot-revert "${vm}" --current
sudo virsh start "${vm}"
done
}
wait_for_shutoff() {
local vm=$1
local retries=60
local delay=2
until test "${retries}" -eq 0
do
echo "waiting for ${vm} to be shut off... #${retries}"
sleep "${delay}"
let "retries=$retries-1"
local current_state=$(sudo virsh domstate "${vm}")
test "${current_state}" == "shut off" && return
done
echo "could not shut off ${vm}"
exit 1
}
while :; do
case $1 in
-d|--delete)
delete_snapshots "$2"
exit
;;
-i|--interactive)
INTERACTIVE=TRUE
;;
-s|--snapshot)
create_snapshots "$2"
;;
-r|--revert)
revert_snapshots "$2"
;;
--)
shift
break
;;
*)
break
esac
shift
done

@ -0,0 +1,30 @@
---
# DEPLOY CONTAINERIZED DAEMONS
docker: true
# DEFINE THE NUMBER OF VMS TO RUN
mon_vms: 1
osd_vms: 1
mds_vms: 0
rgw_vms: 0
nfs_vms: 0
rbd_mirror_vms: 0
client_vms: 0
mgr_vms: 0
# SUBNETS TO USE FOR THE VMS
public_subnet: 192.168.0
cluster_subnet: 192.168.1
# MEMORY
memory: 1024
disks: [ '/dev/sda', '/dev/sdb' ]
eth: 'enp0s8'
vagrant_box: centos/atomic-host
# The sync directory changes based on vagrant box
# Set to /home/vagrant/sync for Centos/7, /home/{ user }/vagrant for openstack and defaults to /vagrant
vagrant_sync_dir: /home/vagrant/sync
skip_tags: 'with_pkg'

@ -0,0 +1,36 @@
---
vagrant_box: 'linode'
vagrant_box_url: 'https://github.com/displague/vagrant-linode/raw/master/box/linode.box'
# Set a label prefix for the machines in this cluster. (This is useful and necessary when running multiple clusters concurrently.)
#label_prefix: 'foo'
ssh_username: 'vagrant'
ssh_private_key_path: '~/.ssh/id_rsa'
cloud_distribution: 'CentOS 7'
cloud_datacenter: 'newark'
# Memory for each Linode instance, this determines price! See Linode plans.
memory: 2048
# The private network on Linode, you probably don't want to change this.
public_subnet: 192.168.0
cluster_subnet: 192.168.0
# DEFINE THE NUMBER OF VMS TO RUN
mon_vms: 3
osd_vms: 3
mds_vms: 1
rgw_vms: 0
nfs_vms: 0
rbd_mirror_vms: 0
client_vms: 0
# The sync directory changes based on vagrant box
# Set to /home/vagrant/sync for Centos/7, /home/{ user }/vagrant for openstack and defaults to /vagrant
# vagrant_sync_dir: /home/vagrant/sync
os_tuning_params:
- { name: fs.file-max, value: 26234859 }

@ -0,0 +1,49 @@
---
# DEPLOY CONTAINERIZED DAEMONS
docker: true
# DEFINE THE NUMBER OF VMS TO RUN
mon_vms: 1
osd_vms: 1
mds_vms: 0
rgw_vms: 0
nfs_vms: 0
rbd_mirror_vms: 0
client_vms: 0
# SUBNET TO USE FOR THE VMS
# Use whatever private subnet your Openstack VMs are given
public_subnet: 172.17.72
cluster_subnet: 172.17.72
# For Openstack VMs, the disk will depend on what you are allocated
disks: [ '/dev/vdb' ]
# For Openstack VMs, the lan is usually eth0
eth: 'eth0'
# For Openstack VMs, choose the following box instead
vagrant_box: 'openstack'
# When using Atomic Hosts (RHEL or CentOS), uncomment the line below to skip package installation
#skip_tags: 'with_pkg'
# Set a label prefix for the machines in this cluster to differentiate
# between different concurrent clusters e.g. your OpenStack username
label_prefix: 'your-openstack-username'
# For deploying on OpenStack VMs uncomment these vars and assign values.
# You can use env vars for the values if it makes sense.
#ssh_username :
#ssh_private_key_path :
#os_openstack_auth_url :
#os_username :
#os_password :
#os_tenant_name :
#os_region :
#os_flavor :
#os_image :
#os_keypair_name :
#os_networks :
#os_floating_ip_pool :

148
dashboard.yml Normal file

@ -0,0 +1,148 @@
---
- name: Deploy node_exporter
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
- "{{ monitoring_group_name|default('monitoring') }}"
gather_facts: false
become: true
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
tags: ['ceph_update_config']
- name: Set ceph node exporter install 'In Progress'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_node_exporter:
status: "In Progress"
start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"
tasks:
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tags: ['ceph_update_config']
- name: Import ceph-container-engine
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
tasks_from: registry
when:
- not containerized_deployment | bool
- ceph_docker_registry_auth | bool
- name: Import ceph-node-exporter role
ansible.builtin.import_role:
name: ceph-node-exporter
post_tasks:
- name: Set ceph node exporter install 'Complete'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_node_exporter:
status: "Complete"
end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"
- name: Deploy grafana and prometheus
hosts: "{{ monitoring_group_name | default('monitoring') }}"
gather_facts: false
become: true
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
tags: ['ceph_update_config']
- name: Set ceph grafana install 'In Progress'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_grafana:
status: "In Progress"
start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"
tasks:
# - ansible.builtin.import_role:
# name: ceph-facts
# tags: ['ceph_update_config']
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: grafana
tags: ['ceph_update_config']
- name: Import ceph-prometheus role
ansible.builtin.import_role:
name: ceph-prometheus
- name: Import ceph-grafana role
ansible.builtin.import_role:
name: ceph-grafana
post_tasks:
- name: Set ceph grafana install 'Complete'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_grafana:
status: "Complete"
end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"
# using groups[] here otherwise it can't fallback to the mon if there's no mgr group.
# adding an additional | default(omit) in case where no monitors are present (external ceph cluster)
- name: Deploy dashboard
hosts: "{{ groups['mgrs'] | default(groups['mons']) | default(omit) }}"
gather_facts: false
become: true
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
tags: ['ceph_update_config']
- name: Set ceph dashboard install 'In Progress'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_dashboard:
status: "In Progress"
start: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"
tasks:
# - name: Import ceph-facts role
# ansible.builtin.import_role:
# name: ceph-facts
# tags: ['ceph_update_config']
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: grafana
tags: ['ceph_update_config']
- name: Import ceph-dashboard role
ansible.builtin.import_role:
name: ceph-dashboard
post_tasks:
- name: Set ceph dashboard install 'Complete'
run_once: true
ansible.builtin.set_stats:
data:
installer_phase_ceph_dashboard:
status: "Complete"
end: "{{ lookup('pipe', 'date +%Y%m%d%H%M%SZ') }}"

1
docs/.gitignore vendored Normal file

@ -0,0 +1 @@
build

20
docs/Makefile Normal file

@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SPHINXPROJ = ceph-ansible
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

155
docs/source/conf.py Normal file

@ -0,0 +1,155 @@
# -*- coding: utf-8 -*-
#
# ceph-ansible documentation build configuration file, created by
# sphinx-quickstart on Wed Apr 5 11:55:38 2017.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = []
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The root toctree document.
root_doc = 'glossary'
# General information about the project.
project = u'ceph-ansible'
copyright = u'2017-2018, Ceph team and individual contributors'
author = u'Ceph team and individual contributors'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = u''
# The full version, including alpha/beta/rc tags.
release = u''
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = []
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# -- Options for HTMLHelp output ------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'ceph-ansibledoc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(root_doc, 'ceph-ansible.tex', u'ceph-ansible Documentation',
u'Ceph team and individual contributors', 'manual'),
]
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(root_doc, 'ceph-ansible', u'ceph-ansible Documentation',
[author], 1)
]
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(root_doc, 'ceph-ansible', u'ceph-ansible Documentation',
author, 'ceph-ansible', 'One line description of project.',
'Miscellaneous'),
]
master_doc = 'index'

View File

@ -0,0 +1,51 @@
Adding/Removing OSD(s) after a cluster is deployed is a common operation that should be straightforward to achieve.
Adding osd(s)
-------------
Adding new OSD(s) on an existing host or adding a new OSD node can be achieved by running the main playbook with the ``--limit`` ansible option.
You basically need to update your host_vars/group_vars with the new hardware, and/or the inventory host file with the new OSD node(s) being added.
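For instance, a minimal sketch of what the ``host_vars`` file for the new OSD node could contain (the file name and device paths below are assumptions, adjust them to your hardware):
.. code-block:: yaml

   # host_vars/osd-node-99.yml (hypothetical example)
   devices:
     - /dev/sdb
     - /dev/sdc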
The command used would look like the following:
``ansible-playbook -vv -i <your-inventory> site-container.yml --limit <node>``
example:
.. code-block:: shell
$ cat hosts
[mons]
mon-node-1
mon-node-2
mon-node-3
[mgrs]
mon-node-1
mon-node-2
mon-node-3
[osds]
osd-node-1
osd-node-2
osd-node-3
osd-node-99
$ ansible-playbook -vv -i hosts site-container.yml --limit osd-node-99
Shrinking osd(s)
----------------
Shrinking OSDs can be done by using the ``shrink-osd.yml`` playbook provided in the ``infrastructure-playbooks`` directory.
The variable ``osd_to_kill`` is a comma separated list of OSD IDs which must be passed to the playbook (passing it as an extra var is the easiest way).
The playbook will shrink all osds passed in ``osd_to_kill`` serially.
example:
.. code-block:: shell
$ ansible-playbook -vv -i hosts infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=1,2,3

View File

@ -0,0 +1,15 @@
Purging the cluster
-------------------
ceph-ansible provides two playbooks in ``infrastructure-playbooks`` for purging a Ceph cluster: ``purge-cluster.yml`` and ``purge-container-cluster.yml``.
The names are pretty self-explanatory, ``purge-cluster.yml`` is intended to purge a non-containerized cluster whereas ``purge-container-cluster.yml`` is to purge a containerized cluster.
example:
.. code-block:: shell
$ ansible-playbook -vv -i hosts infrastructure-playbooks/purge-container-cluster.yml
.. note::
These playbooks aren't intended to be run with the ``--limit`` option.

View File

@ -0,0 +1,17 @@
Upgrading the ceph cluster
--------------------------
ceph-ansible provides a playbook in ``infrastructure-playbooks`` for upgrading a Ceph cluster: ``rolling_update.yml``.
This playbook can be used for both minor upgrades (X.Y to X.Z) and major upgrades (X to Y).
Before running a major upgrade you need to update the ceph-ansible version first.
example:
.. code-block:: shell
$ ansible-playbook -vv -i hosts infrastructure-playbooks/rolling_update.yml
.. note::
This playbook isn't intended to be run with the ``--limit`` ansible option.

102
docs/source/dev/index.rst Normal file
View File

@ -0,0 +1,102 @@
Contribution Guidelines
=======================
The repository centralises all the Ansible roles. The roles are all part of Ansible Galaxy.
We love contributions and we love giving visibility to our contributors; this is why all **commits must be signed-off**.
Mailing list
------------
Please register for the mailing list at http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com.
IRC
---
Feel free to join us in the channel ``#ceph-ansible`` of the OFTC servers (https://www.oftc.net).
GitHub
------
The main GitHub account for the project is at https://github.com/ceph/ceph-ansible/.
Submit a patch
--------------
To start contributing just do:
.. code-block:: console
$ git checkout -b my-working-branch
$ # do your changes #
$ git add -p
If your change impacts a variable file in a role such as ``roles/ceph-common/defaults/main.yml``, you need to generate a ``group_vars`` file:
.. code-block:: console
$ ./generate_group_vars_sample.sh
You are finally ready to push your changes on GitHub:
.. code-block:: console
$ git commit -s
$ git push origin my-working-branch
Worked on a change and you don't want to resend a commit for a syntax fix?
.. code-block:: console
$ # do your syntax change #
$ git commit --amend
$ git push -f origin my-working-branch
Pull Request Testing
--------------------
Pull request testing is handled by Jenkins. All tests must pass before your pull request will be merged.
All of the tests that run are listed in the GitHub UI along with their current status.
If a test fails and you'd like to rerun it, comment on your pull request in the following format:
.. code-block:: none
jenkins test $scenario_name
For example:
.. code-block:: none
jenkins test centos-non_container-all_daemons
Backporting changes
-------------------
If a change should be backported to a ``stable-*`` Git branch:
- Mark your pull request with the GitHub label "Backport" so we don't lose track of it.
- Fetch the latest updates into your clone: ``git fetch``
- Determine the latest available stable branch:
``git branch -r --list "origin/stable-[0-9].[0-9]" | sort -r | sed 1q``
- Create a new local branch for your pull request, based on the stable branch:
``git checkout --no-track -b my-backported-change origin/stable-5.0``
- Cherry-pick your change: ``git cherry-pick -x (your-sha1)``
- Create a new pull request against the ``stable-5.0`` branch.
- Ensure that your pull request's title has the prefix "backport:", so it's clear
to reviewers what this is about.
- Add a comment in your backport pull request linking to the original (main) pull request.
All changes to the stable branches should land in main first, so we avoid
regressions.
Once this is done, one of the project maintainers will tag the tip of the
stable branch with your change. For example:
.. code-block:: console
$ git checkout stable-5.0
$ git pull --ff-only
$ git tag v5.0.12
$ git push origin v5.0.12

9
docs/source/glossary.rst Normal file
View File

@ -0,0 +1,9 @@
Glossary
========
.. toctree::
:maxdepth: 3
:caption: Contents:
index
testing/glossary

339
docs/source/index.rst Normal file
View File

@ -0,0 +1,339 @@
============
ceph-ansible
============
Ansible playbooks for Ceph, the distributed filesystem.
Installation
============
GitHub
------
You can install directly from the source on GitHub by following these steps:
- Clone the repository:
.. code-block:: console
$ git clone https://github.com/ceph/ceph-ansible.git
- Next, you must decide which branch of ``ceph-ansible`` you wish to use. There
are stable branches to choose from or you could use the main branch:
.. code-block:: console
$ git checkout $branch
- Next, use pip and the provided requirements.txt to install Ansible and other
needed Python libraries:
.. code-block:: console
$ pip install -r requirements.txt
.. _ansible-on-rhel-family:
Ansible on RHEL and CentOS
--------------------------
You can acquire Ansible on RHEL and CentOS by installing from `Ansible channel <https://access.redhat.com/articles/3174981>`_.
On RHEL:
.. code-block:: console
$ subscription-manager repos --enable=rhel-7-server-ansible-2-rpms
(CentOS does not use subscription-manager and already has "Extras" enabled by default.)
.. code-block:: console
$ sudo yum install ansible
Ansible on Ubuntu
-----------------
You can acquire Ansible on Ubuntu by using the `Ansible PPA <https://launchpad.net/~ansible/+archive/ubuntu/ansible>`_.
.. code-block:: console
$ sudo add-apt-repository ppa:ansible/ansible
$ sudo apt update
$ sudo apt install ansible
Ansible collections
-------------------
In order to install third-party collections that are required for ceph-ansible,
please run:
.. code-block:: console
$ ansible-galaxy install -r requirements.yml
Releases
========
The following branches should be used depending on your requirements. The ``stable-*``
branches have been QE tested and sometimes receive backport fixes throughout their lifecycle.
The ``main`` branch should be considered experimental and used with caution.
- ``stable-3.0`` Supports Ceph versions ``jewel`` and ``luminous``. This branch requires Ansible version ``2.4``.
- ``stable-3.1`` Supports Ceph versions ``luminous`` and ``mimic``. This branch requires Ansible version ``2.4``.
- ``stable-3.2`` Supports Ceph versions ``luminous`` and ``mimic``. This branch requires Ansible version ``2.6``.
- ``stable-4.0`` Supports Ceph version ``nautilus``. This branch requires Ansible version ``2.9``.
- ``stable-5.0`` Supports Ceph version ``octopus``. This branch requires Ansible version ``2.9``.
- ``stable-6.0`` Supports Ceph version ``pacific``. This branch requires Ansible version ``2.10``.
- ``stable-7.0`` Supports Ceph version ``quincy``. This branch requires Ansible version ``2.15``.
- ``main`` Supports the main (devel) branch of Ceph. This branch requires Ansible version ``2.15`` or ``2.16``.
.. NOTE:: ``stable-3.0`` and ``stable-3.1`` branches of ceph-ansible are deprecated and no longer maintained.
Configuration and Usage
=======================
This project assumes you have a basic knowledge of how Ansible works and have already prepared your hosts for
configuration by Ansible.
After you've cloned the ``ceph-ansible`` repository, selected your branch and installed Ansible then you'll need to create
your inventory file, playbook and configuration for your Ceph cluster.
Inventory
---------
The Ansible inventory file defines the hosts in your cluster and what roles each host plays in your Ceph cluster. The default
location for an inventory file is ``/etc/ansible/hosts`` but this file can be placed anywhere and used with the ``-i`` flag of
``ansible-playbook``.
An example inventory file would look like:
.. code-block:: ini
[mons]
mon1
mon2
mon3
[osds]
osd1
osd2
osd3
.. note::
For more information on Ansible inventories please refer to the Ansible documentation: http://docs.ansible.com/ansible/latest/intro_inventory.html
Playbook
--------
You must have a playbook to pass to the ``ansible-playbook`` command when deploying your cluster. There is a sample playbook at the root of the ``ceph-ansible``
project called ``site.yml.sample``. This playbook should work fine for most usages, but it does include by default every daemon group which might not be
appropriate for your cluster setup. Perform the following steps to prepare your playbook:
- Rename the sample playbook: ``mv site.yml.sample site.yml``
- Modify the playbook as necessary for the requirements of your cluster
.. note::
It's important the playbook you use is placed at the root of the ``ceph-ansible`` project. This is how Ansible will be able to find the roles that
``ceph-ansible`` provides.
Configuration Validation
------------------------
The ``ceph-ansible`` project provides config validation through the ``ceph-validate`` role. If you are using one of the provided playbooks this role will
be run early in the deployment so as to ensure you've given ``ceph-ansible`` the correct config. This check only makes sure that you've provided the
proper config settings for your cluster, not that the values in them will produce a healthy cluster. For example, if you give an incorrect address for
``monitor_address`` then the mon will still fail to join the cluster.
An example of a validation failure might look like:
.. code-block:: console
TASK [ceph-validate : validate provided configuration] *************************
task path: /Users/andrewschoen/dev/ceph-ansible/roles/ceph-validate/tasks/main.yml:3
Wednesday 02 May 2018 13:48:16 -0500 (0:00:06.984) 0:00:18.803 *********
[ERROR]: [mon0] Validation failed for variable: osd_objectstore
[ERROR]: [mon0] Given value for osd_objectstore: foo
[ERROR]: [mon0] Reason: osd_objectstore must be either 'bluestore' or 'filestore'
fatal: [mon0]: FAILED! => {
"changed": false
}
Supported Validation
^^^^^^^^^^^^^^^^^^^^
The ``ceph-validate`` role currently supports validation of the proper config for the following
osd scenarios:
- ``collocated``
- ``non-collocated``
- ``lvm``
The following install options are also validated by the ``ceph-validate`` role:
- ``ceph_origin`` set to ``distro``
- ``ceph_origin`` set to ``repository``
- ``ceph_origin`` set to ``local``
- ``ceph_repository`` set to ``dev``
- ``ceph_repository`` set to ``community``
Installation methods
--------------------
Ceph can be installed through several methods.
.. toctree::
:maxdepth: 1
installation/methods
Configuration
-------------
The configuration for your Ceph cluster will be set by the use of ansible variables that ``ceph-ansible`` provides. All of these options and their default
values are defined in the ``group_vars/`` directory at the root of the ``ceph-ansible`` project. Ansible will use configuration in a ``group_vars/`` directory
that is relative to your inventory file or your playbook. Inside of the ``group_vars/`` directory there are many sample Ansible configuration files that relate
to each of the Ceph daemon groups by their filename. For example, the ``osds.yml.sample`` contains all the default configuration for the OSD daemons. The ``all.yml.sample``
file is a special ``group_vars`` file that applies to all hosts in your cluster.
.. note::
For more information on setting group or host specific configuration refer to the Ansible documentation: http://docs.ansible.com/ansible/latest/intro_inventory.html#splitting-out-host-and-group-specific-data
At the most basic level you must tell ``ceph-ansible`` what version of Ceph you wish to install, the method of installation, your cluster's network settings and
how you want your OSDs configured. To begin your configuration rename each file in ``group_vars/`` you wish to use so that it does not include the ``.sample``
at the end of the filename, uncomment the options you wish to change and provide your own value.
An example configuration that deploys the upstream ``octopus`` version of Ceph with the lvm batch method would look like this in ``group_vars/all.yml``:
.. code-block:: yaml
ceph_origin: repository
ceph_repository: community
public_network: "192.168.3.0/24"
cluster_network: "192.168.4.0/24"
devices:
- '/dev/sda'
- '/dev/sdb'
The following config options are required to be changed on all installations but there could be other required options depending on your OSD scenario
selection or other aspects of your cluster.
- ``ceph_origin``
- ``public_network``
When deploying RGW instance(s) you are required to set the ``radosgw_interface`` or ``radosgw_address`` config option.
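For example, a minimal sketch in your ``group_vars`` setting one of the two options (the interface name and address below are assumptions, use the ones matching your environment):
.. code-block:: yaml

   # choose one of the two options below (illustrative values)
   radosgw_interface: eth1
   # radosgw_address: 192.168.3.10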
``ceph.conf`` Configuration File
---------------------------------
The supported method for defining your ``ceph.conf`` is to use the ``ceph_conf_overrides`` variable. This allows you to specify configuration options using
an INI format. This variable can be used to override sections already defined in ``ceph.conf`` (see: ``roles/ceph-config/templates/ceph.conf.j2``) or to provide
new configuration options.
The following sections in ``ceph.conf`` are supported:
* ``[global]``
* ``[mon]``
* ``[osd]``
* ``[mds]``
* ``[client.rgw.{instance_name}]``
An example:
.. code-block:: yaml
ceph_conf_overrides:
global:
foo: 1234
bar: 5678
osd:
osd_mkfs_type: ext4
.. note::
We will no longer accept pull requests that modify the ``ceph.conf`` template unless it helps the deployment. For simple configuration tweaks
please use the ``ceph_conf_overrides`` variable.
Full documentation for configuring each of the Ceph daemon types is in the following sections.
OSD Configuration
-----------------
OSD configuration used to be set by selecting an OSD scenario and providing the configuration needed for
that scenario. As of nautilus in stable-4.0, the only scenario available is ``lvm``.
.. toctree::
:maxdepth: 1
osds/scenarios
Day-2 Operations
----------------
ceph-ansible provides a set of playbooks in the ``infrastructure-playbooks`` directory for performing some basic day-2 operations.
.. toctree::
:maxdepth: 1
day-2/osds
day-2/purge
day-2/upgrade
RBD Mirroring
-------------
Ceph-ansible provides the role ``ceph-rbd-mirror`` that can set up RBD mirror replication.
.. toctree::
:maxdepth: 1
rbdmirror/index
Contribution
============
See the following section for guidelines on how to contribute to ``ceph-ansible``.
.. toctree::
:maxdepth: 1
dev/index
Testing
=======
Documentation for writing functional testing scenarios for ``ceph-ansible``.
* :doc:`Testing with ceph-ansible <testing/index>`
* :doc:`Glossary <testing/glossary>`
Demos
=====
Vagrant Demo
------------
Deployment from scratch on vagrant machines: https://youtu.be/E8-96NamLDo
Bare metal demo
---------------
Deployment from scratch on bare metal machines: https://youtu.be/dv_PEp9qAqg

View File

@ -0,0 +1,64 @@
Containerized deployment
========================
Ceph-ansible supports only Docker and Podman for deploying Ceph in a containerized context.
Configuration and Usage
-----------------------
To deploy ceph in containers, you will need to set the ``containerized_deployment`` variable to ``true`` and use the site-container.yml.sample playbook.
.. code-block:: yaml
containerized_deployment: true
The ``ceph_origin`` and ``ceph_repository`` variables aren't needed anymore in containerized deployment and are ignored.
.. code-block:: console
$ ansible-playbook site-container.yml.sample
.. note::
The infrastructure playbooks are working for both non containerized and containerized deployment.
Custom container image
----------------------
You can configure your own container registry, image and tag by using the ``ceph_docker_registry``, ``ceph_docker_image`` and ``ceph_docker_image_tag`` variables.
.. code-block:: yaml
ceph_docker_registry: quay.io
ceph_docker_image: ceph/ceph
ceph_docker_image_tag: v19
.. note::
``ceph_docker_image`` should have both image namespace and image name concatenated and separated by a slash character.
``ceph_docker_image_tag`` should be set to a fixed tag, not to any "latest" tags unless you know what you are doing. Using a "latest" tag
might make the playbook restart all the daemons deployed in your cluster since these tags are intended to be updated periodically.
Container registry authentication
---------------------------------
When using a container registry with authentication, you need to set the ``ceph_docker_registry_auth`` variable to ``true`` and provide the credentials via the
``ceph_docker_registry_username`` and ``ceph_docker_registry_password`` variables.
.. code-block:: yaml
ceph_docker_registry_auth: true
ceph_docker_registry_username: foo
ceph_docker_registry_password: bar
Container registry behind a proxy
---------------------------------
When using a container registry reachable via an HTTP(S) proxy, you need to set the ``ceph_docker_http_proxy`` and/or ``ceph_docker_https_proxy`` variables. If you need
to exclude some hosts from the proxy configuration, you can use the ``ceph_docker_no_proxy`` variable.
.. code-block:: yaml
ceph_docker_http_proxy: http://192.168.42.100:8080
ceph_docker_https_proxy: https://192.168.42.100:8080
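If some hosts or registries must bypass the proxy, a minimal sketch of the exclusion variable could look like this (the values are assumptions):
.. code-block:: yaml

   # example hosts that should bypass the proxy (illustrative values)
   ceph_docker_no_proxy: "localhost,127.0.0.1"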

View File

@ -0,0 +1,12 @@
Installation methods
====================
ceph-ansible can deploy Ceph either in a non-containerized context (via packages) or in a containerized context using ceph-container images.
.. toctree::
:maxdepth: 1
non-containerized
containerized
The difference here is that you don't have the ``rbd`` command on the host when using the containerized deployment, so everything related to Ceph needs to be executed within a container. If there is software, e.g. Open Nebula, which requires that the ``rbd`` command is accessible directly on the host (non-containerized), then you have to install the ``rbd`` command yourself on those servers outside of containers (or make sure that this software somehow runs within containers as well and that it can access ``rbd``).

View File

@ -0,0 +1,58 @@
Non containerized deployment
============================
The following are all of the available options for installing Ceph through different channels.
We support 3 main installation methods, all managed by the ``ceph_origin`` variable:
- ``repository``: means that you will get Ceph installed through a new repository. Later below, choose between ``community``, ``dev``, ``uca`` or ``custom``. These options will be exposed through the ``ceph_repository`` variable.
- ``distro``: means that no separate repo file will be added and you will get whatever version of Ceph is included in your Linux distro.
- ``local``: means that the Ceph binaries will be copied over from the local machine (not well tested, use at your own risk)
Origin: Repository
------------------
If ``ceph_origin`` is set to ``repository``, you now have the choice between a couple of repositories controlled by the ``ceph_repository`` option:
- ``community``: fetches packages from http://download.ceph.com, the official community Ceph repositories
- ``dev``: fetches packages from shaman, a gitbuilder based package system
- ``uca``: fetches packages from Ubuntu Cloud Archive
- ``custom``: fetches packages from a specific repository
Community repository
~~~~~~~~~~~~~~~~~~~~
If ``ceph_repository`` is set to ``community``, packages will be installed by default from http://download.ceph.com; this can be changed by tweaking ``ceph_mirror``.
UCA repository
~~~~~~~~~~~~~~
If ``ceph_repository`` is set to ``uca``, packages will be installed by default from http://ubuntu-cloud.archive.canonical.com/ubuntu; this can be changed by tweaking ``ceph_stable_repo_uca``.
You can also decide which OpenStack version the Ceph packages should come from by tweaking ``ceph_stable_openstack_release_uca``.
For example, ``ceph_stable_openstack_release_uca: queens``.
Dev repository
~~~~~~~~~~~~~~
If ``ceph_repository`` is set to ``dev``, packages will be installed by default from https://shaman.ceph.com/; this cannot be tweaked.
You can decide which branch to install with the help of ``ceph_dev_branch`` (defaults to 'main').
Additionally, you can specify a SHA1 with ``ceph_dev_sha1``, defaults to 'latest' (as in latest built).
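For example, a minimal sketch of a ``group_vars/all.yml`` using the dev repository (the branch and sha1 shown are simply the documented defaults):
.. code-block:: yaml

   ceph_origin: repository
   ceph_repository: dev
   ceph_dev_branch: main   # defaults to 'main'
   ceph_dev_sha1: latest   # defaults to 'latest'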
Custom repository
~~~~~~~~~~~~~~~~~
If ``ceph_repository`` is set to ``custom``, packages will be installed from a repository of your choosing.
This repository is specified with ``ceph_custom_repo``, e.g: ``ceph_custom_repo: https://server.domain.com/ceph-custom-repo``.
Origin: Distro
--------------
If ``ceph_origin`` is set to ``distro``, no separate repo file will be added and you will get whatever version of Ceph is included in your Linux distro.
Origin: Local
-------------
If ``ceph_origin`` is set to ``local``, the ceph binaries will be copied over from the local machine (not well tested, use at your own risk)

View File

@ -0,0 +1,221 @@
OSD Scenario
============
As of stable-4.0, the following scenarios are not supported anymore since they are associated to ``ceph-disk``:
* `collocated`
* `non-collocated`
Since the Ceph luminous release, it is preferred to use the :ref:`lvm scenario
<osd_scenario_lvm>` that uses the ``ceph-volume`` provisioning tool. Any other
scenario will cause deprecation warnings.
``ceph-disk`` was deprecated during the ceph-ansible 3.2 cycle and has been removed entirely from Ceph itself in the Nautilus version.
At present (starting from stable-4.0), there is only one scenario, which defaults to ``lvm``, see:
* :ref:`lvm <osd_scenario_lvm>`
So there is no need to configure ``osd_scenario`` anymore; it defaults to ``lvm``.
The ``lvm`` scenario mentioned above supports both containerized and non-containerized clusters.
As a reminder, deploying a containerized cluster can be done by setting ``containerized_deployment``
to ``True``.
If you want to skip OSD creation during a ``ceph-ansible run``
(e.g. because you have already provisioned your OSDs but disk IDs have
changed), you can skip the ``prepare_osd`` tag i.e. by specifying
``--skip-tags prepare_osd`` on the ``ansible-playbook`` command line.
.. _osd_scenario_lvm:
lvm
---
This OSD scenario uses ``ceph-volume`` to create OSDs, primarily using LVM, and
is only available when the Ceph release is luminous or newer.
It is automatically enabled.
Other (optional) supported settings:
- ``dmcrypt``: Enable Ceph's encryption on OSDs using ``dmcrypt``.
Defaults to ``false`` if unset.
- ``osds_per_device``: Provision more than 1 OSD (the default if unset) per device.
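As an illustration, a minimal sketch combining these optional settings with the ``devices`` list described below (device paths and values are assumptions):
.. code-block:: yaml

   devices:
     - /dev/sda
     - /dev/sdb
   dmcrypt: true          # encrypt OSDs with dmcrypt
   osds_per_device: 2     # provision 2 OSDs per device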
Simple configuration
^^^^^^^^^^^^^^^^^^^^
With this approach, most of the decisions on how devices are configured to
provision an OSD are made by the Ceph tooling (``ceph-volume lvm batch`` in
this case). There is almost no room to modify how the OSD is composed given an
input of devices.
To use this configuration, the ``devices`` option must be populated with the
raw device paths that will be used to provision the OSDs.
.. note:: Raw devices must be "clean", without a gpt partition table, or
logical volumes present.
For example, for a node that has ``/dev/sda`` and ``/dev/sdb`` intended for
Ceph usage, the configuration would be:
.. code-block:: yaml
devices:
- /dev/sda
- /dev/sdb
In the above case, if both devices are spinning drives, 2 OSDs would be
created, each with its own collocated journal.
Other provisioning strategies are possible, by mixing spinning and solid state
devices, for example:
.. code-block:: yaml
devices:
- /dev/sda
- /dev/sdb
- /dev/nvme0n1
Similar to the initial example, this would end up producing 2 OSDs, but data
would be placed on the slower spinning drives (``/dev/sda``, and ``/dev/sdb``)
and journals would be placed on the faster solid state device ``/dev/nvme0n1``.
The ``ceph-volume`` tool describes this in detail in
`the "batch" subcommand section <https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/>`_
This option can also be used with ``osd_auto_discovery``, meaning that you do not need to populate
``devices`` directly and any appropriate devices found by ansible will be used instead.
.. code-block:: yaml
osd_auto_discovery: true
Other (optional) supported settings:
- ``crush_device_class``: Sets the CRUSH device class for all OSDs created with this
method (it is not possible to have a per-OSD CRUSH device class using the *simple*
configuration approach). Values *must be* a string, like
``crush_device_class: "ssd"``
Advanced configuration
^^^^^^^^^^^^^^^^^^^^^^
This configuration is useful when more granular control is wanted when setting
up devices and how they should be arranged to provision an OSD. It requires an
existing setup of volume groups and logical volumes (``ceph-volume`` will **not**
create these).
To use this configuration, the ``lvm_volumes`` option must be populated with
logical volumes and volume groups. Additionally, absolute paths to partitions
*can* be used for ``journal``, ``block.db``, and ``block.wal``.
.. note:: This configuration uses ``ceph-volume lvm create`` to provision OSDs
Supported ``lvm_volumes`` configuration settings:
- ``data``: The logical volume name or full path to a raw device (an LV will be
created using 100% of the raw device)
- ``data_vg``: The volume group name, **required** if ``data`` is a logical volume.
- ``crush_device_class``: CRUSH device class name for the resulting OSD; allows
setting the device class for each OSD, unlike the global ``crush_device_class``
that sets it for all OSDs.
.. note:: If you wish to set the ``crush_device_class`` for the OSDs
when using ``devices`` you must set it using the global ``crush_device_class``
option as shown above. There is no way to define a specific CRUSH device class
per OSD when using ``devices`` like there is for ``lvm_volumes``.
.. warning:: Each entry must be unique, duplicate values are not allowed
``bluestore`` objectstore variables:
- ``db``: The logical volume name or full path to a partition.
- ``db_vg``: The volume group name, **required** if ``db`` is a logical volume.
- ``wal``: The logical volume name or full path to a partition.
- ``wal_vg``: The volume group name, **required** if ``wal`` is a logical volume.
.. note:: These ``bluestore`` variables are optional optimizations. Bluestore's
``db`` and ``wal`` will only benefit from faster devices. It is possible to
create a bluestore OSD with a single raw device.
.. warning:: Each entry must be unique, duplicate values are not allowed
``bluestore`` example using raw devices:
.. code-block:: yaml
osd_objectstore: bluestore
lvm_volumes:
- data: /dev/sda
- data: /dev/sdb
.. note:: Volume groups and logical volumes will be created in this case,
utilizing 100% of the devices.
``bluestore`` example with logical volumes:
.. code-block:: yaml
osd_objectstore: bluestore
lvm_volumes:
- data: data-lv1
data_vg: data-vg1
- data: data-lv2
data_vg: data-vg2
.. note:: Volume groups and logical volumes must exist.
``bluestore`` example defining ``wal`` and ``db`` logical volumes:
.. code-block:: yaml
osd_objectstore: bluestore
lvm_volumes:
- data: data-lv1
data_vg: data-vg1
db: db-lv1
db_vg: db-vg1
wal: wal-lv1
wal_vg: wal-vg1
- data: data-lv2
data_vg: data-vg2
db: db-lv2
db_vg: db-vg2
wal: wal-lv2
wal_vg: wal-vg2
.. note:: Volume groups and logical volumes must exist.
``filestore`` example with logical volumes:
.. code-block:: yaml
osd_objectstore: filestore
lvm_volumes:
- data: data-lv1
data_vg: data-vg1
journal: journal-lv1
journal_vg: journal-vg1
- data: data-lv2
data_vg: data-vg2
journal: journal-lv2
journal_vg: journal-vg2
.. note:: Volume groups and logical volumes must exist.

View File

@ -0,0 +1,60 @@
RBD Mirroring
=============
There is not much to do from the primary cluster side in order to set up RBD mirror replication.
``ceph_rbd_mirror_configure`` has to be set to ``true`` to make ceph-ansible create the mirrored pool
defined in ``ceph_rbd_mirror_pool`` and the keyring that is going to be used to add the rbd mirror peer.
group_vars from the primary cluster:
.. code-block:: yaml
ceph_rbd_mirror_configure: true
ceph_rbd_mirror_pool: rbd
Optionally, you can tell ceph-ansible to set the name and the secret of the keyring you want to create:
.. code-block:: yaml
ceph_rbd_mirror_local_user: client.rbd-mirror-peer # 'client.rbd-mirror-peer' is the default value.
ceph_rbd_mirror_local_user_secret: AQC+eM1iKKBXFBAAVpunJvqpkodHSYmljCFCnw==
This secret will be needed to add the rbd mirror peer from the secondary cluster.
If you do not enforce it as shown above, you can get it from a monitor by running the following command:
``ceph auth get {{ ceph_rbd_mirror_local_user }}``
.. code-block:: shell
$ sudo ceph auth get client.rbd-mirror-peer
Once your variables are defined, you can run the playbook (you might want to run with --limit option):
.. code-block:: shell
$ ansible-playbook -vv -i hosts site-container.yml --limit rbdmirror0
Strictly speaking, the configuration of the RBD mirror replication is done on the secondary cluster.
The rbd-mirror daemon pulls the data from the primary cluster. This is where the RBD mirror peer addition has to be done.
The configuration is similar to what was done on the primary cluster, it just needs a few additional variables.
``ceph_rbd_mirror_remote_user`` : This user must match the name defined in the variable ``ceph_rbd_mirror_local_user`` from the primary cluster.
``ceph_rbd_mirror_remote_mon_hosts`` : This must be a comma-separated list of the monitor addresses from the primary cluster.
``ceph_rbd_mirror_remote_key`` : This must be the same value as the user (``{{ ceph_rbd_mirror_local_user }}``) keyring secret from the primary cluster.
group_vars from the secondary cluster:
.. code-block:: yaml
ceph_rbd_mirror_configure: true
ceph_rbd_mirror_pool: rbd
ceph_rbd_mirror_remote_user: client.rbd-mirror-peer # This must match the value defined in {{ ceph_rbd_mirror_local_user }} on primary cluster.
ceph_rbd_mirror_remote_mon_hosts: 1.2.3.4
ceph_rbd_mirror_remote_key: AQC+eM1iKKBXFBAAVpunJvqpkodHSYmljCFCnw== # This must match the secret of the registered keyring of the user defined in {{ ceph_rbd_mirror_local_user }} on primary cluster.
Once your variables are defined, you can run the playbook (you might want to run with the --limit option):
.. code-block:: shell
$ ansible-playbook -vv -i hosts site-container.yml --limit rbdmirror0

View File

@ -0,0 +1,4 @@
.. _development:
ceph-ansible testing for development
====================================

View File

@ -0,0 +1,14 @@
Glossary
========
.. toctree::
:maxdepth: 1
index
running.rst
development.rst
scenarios.rst
modifying.rst
layout.rst
tests.rst
tox.rst

View File

@ -0,0 +1,38 @@
.. _testing:
Testing
=======
``ceph-ansible`` has the ability to test different scenarios (collocated journals
or dmcrypt OSDs for example) in an isolated, repeatable, and easy way.
These tests can run locally with VirtualBox or via libvirt if available, which
removes the need to solely rely on a CI system like Jenkins to verify
a behavior.
* **Getting started:**
* :doc:`Running a Test Scenario <running>`
* :ref:`dependencies`
* **Configuration and structure:**
* :ref:`layout`
* :ref:`test_files`
* :ref:`scenario_files`
* :ref:`scenario_wiring`
* **Adding or modifying tests:**
* :ref:`test_conventions`
* :ref:`testinfra`
* **Adding or modifying a scenario:**
* :ref:`scenario_conventions`
* :ref:`scenario_environment_configuration`
* :ref:`scenario_ansible_configuration`
* **Custom/development repositories and packages:**
* :ref:`tox_environment_variables`

View File

@ -0,0 +1,60 @@
.. _layout:
Layout and conventions
----------------------
Test files and directories follow a few conventions, which makes it easy to
create (or expect) certain interactions between tests and scenarios.
All tests are in the ``tests`` directory. Scenarios are defined in
``tests/functional/`` and use the following convention for directory
structure:
.. code-block:: none
tests/functional/<distro>/<distro version>/<scenario name>/
For example: ``tests/functional/centos/7/journal-collocation``
Within a test scenario there are a few files that define what that specific
scenario needs for the tests, like how many OSD nodes or MON nodes.
At the very least, a scenario will need these files:
* ``Vagrantfile``: must be symlinked from the root directory of the project
* ``hosts``: An Ansible hosts file that defines the machines part of the
cluster
* ``group_vars/all``: if any modifications are needed for deployment, this
would override them. Additionally, further customizations can be done. For
example, for OSDs that would mean adding ``group_vars/osds``
* ``vagrant_variables.yml``: Defines the actual environment for the test, where
machines, networks, disks, linux distro/version, can be defined.
.. _test_conventions:
Conventions
-----------
Python test files (unlike scenarios) rely on paths to *map* where they belong. For
example, a file that should only test monitor nodes would live in
``ceph-ansible/tests/functional/tests/mon/``. Internally, the test runner
(``py.test``) will *mark* these as tests that should run on a monitor only.
Since the configuration of a scenario already defines what node has a given
role, then it is easier for the system to only run tests that belong to
a particular node type.
The current convention is a bit manual, with initial path support for:
* mon
* osd
* mds
* rgw
* journal_collocation
* all/any (if none of the above are matched, then these are run on any host)
.. _testinfra:
``testinfra``
-------------

View File

@ -0,0 +1,4 @@
.. _modifying:
Modifying (or adding) tests
===========================

View File

@ -0,0 +1,169 @@
.. _running_tests:
Running Tests
=============
Although tests run continuously in CI, a lot of effort was put into making it
easy to run in any environment, as long as a couple of requirements are met.
.. _dependencies:
Dependencies
------------
There are some Python dependencies, which are listed in a ``requirements.txt``
file within the ``tests/`` directory. These are meant to be installed using
Python install tools (pip in this case):
.. code-block:: console
pip install -r tests/requirements.txt
For virtualization, either libvirt or VirtualBox is needed (there is native
support from the harness for both). This makes the test harness even more
flexible as most platforms will be covered by either VirtualBox or libvirt.
.. _running_a_scenario:
Running a scenario
------------------
Tests are driven by ``tox``, a command line tool to run a matrix of tests defined in
a configuration file (``tox.ini`` in this case at the root of the project).
For a thorough description of a scenario see :ref:`test_scenarios`.
To run a single scenario, make sure it is available (should be defined from
``tox.ini``) by listing them:
.. code-block:: console
tox -l
In this example, we will use the ``luminous-ansible2.4-xenial_cluster`` one. The
harness defaults to ``VirtualBox`` as the backend, so if you have that
installed in your system then this command should just work:
.. code-block:: console
tox -e luminous-ansible2.4-xenial_cluster
And for libvirt it would be:
.. code-block:: console
tox -e luminous-ansible2.4-xenial_cluster -- --provider=libvirt
.. warning::
Depending on the type of scenario and resources available, running
these tests locally in a personal computer can be very resource intensive.
.. note::
Most test runs take between 20 and 40 minutes, depending on system resources.
The command should bring up the machines needed for the test, provision them
with ``ceph-ansible``, run the tests, and tear the whole environment down at the
end.
The output would look something similar to this trimmed version:
.. code-block:: console
luminous-ansible2.4-xenial_cluster create: /Users/alfredo/python/upstream/ceph-ansible/.tox/luminous-ansible2.4-xenial_cluster
luminous-ansible2.4-xenial_cluster installdeps: ansible==2.4.2, -r/Users/alfredo/python/upstream/ceph-ansible/tests/requirements.txt
luminous-ansible2.4-xenial_cluster runtests: commands[0] | vagrant up --no-provision --provider=virtualbox
Bringing machine 'client0' up with 'virtualbox' provider...
Bringing machine 'rgw0' up with 'virtualbox' provider...
Bringing machine 'mds0' up with 'virtualbox' provider...
Bringing machine 'mon0' up with 'virtualbox' provider...
Bringing machine 'mon1' up with 'virtualbox' provider...
Bringing machine 'mon2' up with 'virtualbox' provider...
Bringing machine 'osd0' up with 'virtualbox' provider...
...
After all the nodes are up, ``ceph-ansible`` will provision them, and run the
playbook(s):
.. code-block:: console
...
PLAY RECAP *********************************************************************
client0 : ok=4 changed=0 unreachable=0 failed=0
mds0 : ok=4 changed=0 unreachable=0 failed=0
mon0 : ok=4 changed=0 unreachable=0 failed=0
mon1 : ok=4 changed=0 unreachable=0 failed=0
mon2 : ok=4 changed=0 unreachable=0 failed=0
osd0 : ok=4 changed=0 unreachable=0 failed=0
rgw0 : ok=4 changed=0 unreachable=0 failed=0
...
Once the whole environment is all running the tests will be sent out to the
hosts, with output similar to this:
.. code-block:: console
luminous-ansible2.4-xenial_cluster runtests: commands[4] | testinfra -n 4 --sudo -v --connection=ansible --ansible-inventory=/Users/alfredo/python/upstream/ceph-ansible/tests/functional/ubuntu/16.04/cluster/hosts /Users/alfredo/python/upstream/ceph-ansible/tests/functional/tests
============================ test session starts ===========================
platform darwin -- Python 2.7.8, pytest-3.0.7, py-1.4.33, pluggy-0.4.0 -- /Users/alfredo/python/upstream/ceph-ansible/.tox/luminous-ansible2.4-xenial_cluster/bin/python
cachedir: ../../../../.cache
rootdir: /Users/alfredo/python/upstream/ceph-ansible/tests, inifile: pytest.ini
plugins: testinfra-1.5.4, xdist-1.15.0
[gw0] darwin Python 2.7.8 cwd: /Users/alfredo/python/upstream/ceph-ansible/tests/functional/ubuntu/16.04/cluster
[gw1] darwin Python 2.7.8 cwd: /Users/alfredo/python/upstream/ceph-ansible/tests/functional/ubuntu/16.04/cluster
[gw2] darwin Python 2.7.8 cwd: /Users/alfredo/python/upstream/ceph-ansible/tests/functional/ubuntu/16.04/cluster
[gw3] darwin Python 2.7.8 cwd: /Users/alfredo/python/upstream/ceph-ansible/tests/functional/ubuntu/16.04/cluster
[gw0] Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) -- [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
[gw1] Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) -- [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
[gw2] Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) -- [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
[gw3] Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) -- [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
gw0 [154] / gw1 [154] / gw2 [154] / gw3 [154]
scheduling tests via LoadScheduling
../../../tests/test_install.py::TestInstall::test_ceph_dir_exists[ansible:/mon0]
../../../tests/test_install.py::TestInstall::test_ceph_dir_is_a_directory[ansible:/mon0]
../../../tests/test_install.py::TestInstall::test_ceph_conf_is_a_file[ansible:/mon0]
../../../tests/test_install.py::TestInstall::test_ceph_dir_is_a_directory[ansible:/mon1]
[gw2] PASSED ../../../tests/test_install.py::TestCephConf::test_ceph_config_has_mon_host_line[ansible:/mon0]
../../../tests/test_install.py::TestInstall::test_ceph_conf_exists[ansible:/mon1]
[gw3] PASSED ../../../tests/test_install.py::TestCephConf::test_mon_host_line_has_correct_value[ansible:/mon0]
../../../tests/test_install.py::TestInstall::test_ceph_conf_is_a_file[ansible:/mon1]
[gw1] PASSED ../../../tests/test_install.py::TestInstall::test_ceph_command_exists[ansible:/mon1]
../../../tests/test_install.py::TestCephConf::test_mon_host_line_has_correct_value[ansible:/mon1]
...
Finally the whole environment gets torn down:
.. code-block:: console
luminous-ansible2.4-xenial_cluster runtests: commands[5] | vagrant destroy --force
==> osd0: Forcing shutdown of VM...
==> osd0: Destroying VM and associated drives...
==> mon2: Forcing shutdown of VM...
==> mon2: Destroying VM and associated drives...
==> mon1: Forcing shutdown of VM...
==> mon1: Destroying VM and associated drives...
==> mon0: Forcing shutdown of VM...
==> mon0: Destroying VM and associated drives...
==> mds0: Forcing shutdown of VM...
==> mds0: Destroying VM and associated drives...
==> rgw0: Forcing shutdown of VM...
==> rgw0: Destroying VM and associated drives...
==> client0: Forcing shutdown of VM...
==> client0: Destroying VM and associated drives...
And a brief summary of the scenario(s) that ran is displayed:
.. code-block:: console
________________________________________________ summary _________________________________________________
luminous-ansible2.4-xenial_cluster: commands succeeded
congratulations :)

View File

@ -0,0 +1,211 @@
.. _test_scenarios:
Test Scenarios
==============
Scenarios are distinct environments that describe a Ceph deployment and
configuration. Scenarios are isolated as well, and define what machines are
needed aside from any ``ceph-ansible`` configuration.
.. _scenario_files:
Scenario Files
==============
The scenario is described in a ``vagrant_variables.yml`` file, which is
consumed by ``Vagrant`` when bringing up an environment.
This yaml file is loaded in the ``Vagrantfile`` so that the settings can be
used to bring up the boxes and pass some configuration to ansible when running.
.. note::
The basic layout of a scenario is covered in :ref:`layout`.
There are just a handful of required files; these sections will cover the
required (most basic) ones. Alternatively, other ``ceph-ansible`` files can be
added to customize the behavior of a scenario deployment.
.. _vagrant_variables:
``vagrant_variables.yml``
-------------------------
There are a few sections in the ``vagrant_variables.yml`` file which are easy
to follow (most of them are 1 line settings).
* **docker**: (bool) Indicates if the scenario will deploy Docker daemons
* **VMS**: (int) These integer values are just a count of how many machines will be
needed. Each supported type is listed, defaulting to 0:
.. code-block:: yaml
mon_vms: 0
osd_vms: 0
mds_vms: 0
rgw_vms: 0
nfs_vms: 0
rbd_mirror_vms: 0
client_vms: 0
mgr_vms: 0
For a deployment that needs 1 MON and 1 OSD, the list would look like:
.. code-block:: yaml
mon_vms: 1
osd_vms: 1
* **CEPH SOURCE**: (string) indicate whether a ``dev`` or ``stable`` release is
needed. A ``stable`` release will use the latest stable release of Ceph,
a ``dev`` will use ``shaman`` (http://shaman.ceph.com)
* **SUBNETS**: These are used for configuring the network availability of each
server that will be booted as well as being used as configuration for
``ceph-ansible`` (and eventually Ceph). The two values that are **required**:
.. code-block:: yaml
public_subnet: 192.168.13
cluster_subnet: 192.168.14
* **MEMORY**: Memory requirements (in megabytes) for each server, e.g.
``memory: 512``
* **interfaces**: some vagrant boxes (and linux distros) set specific
interfaces. For Ubuntu releases older than Xenial it was common to have
``eth1``, for CentOS and some Xenial boxes ``enp0s8`` is used. **However**
the public Vagrant boxes normalize the interface to ``eth1`` for all boxes,
making it easier to configure them with Ansible later.
.. warning::
Do *not* change the interface from ``eth1`` unless absolutely
certain that is needed for a box. Some tests that depend on that
naming will fail.
* **disks**: The disks that will be created for each machine, for most
environments ``/dev/sd*`` style of disks will work, like: ``[ '/dev/sda', '/dev/sdb' ]``
* **vagrant_box**: We have published our own boxes to normalize what we test
against. These boxes are published in Atlas
(https://atlas.hashicorp.com/ceph/). Currently valid values are:
``ceph/ubuntu-xenial``, and ``ceph/centos7``
The following aren't usually changed/enabled for tests, since they don't have
an impact, however they are documented here for general knowledge in case they
are needed:
* **ssh_private_key_path**: The path to the ``id_rsa`` (or other private SSH
key) that should be used to connect to these boxes.
* **vagrant_sync_dir**: what should be "synced" (made available on the new
servers) from the host.
* **vagrant_disable_synced_folder**: (bool) when disabled, it will make
booting machines faster because no files need to be synced over.
* **os_tuning_params**: These are passed on to ``ceph-ansible`` as part of the
variables for "system tuning". These shouldn't be changed.
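Putting the settings above together, a minimal ``vagrant_variables.yml`` sketch for a 1 MON / 1 OSD scenario could look like the following (the values are taken from the examples above and are only illustrative):
.. code-block:: yaml

   docker: false
   mon_vms: 1
   osd_vms: 1
   mds_vms: 0
   rgw_vms: 0
   nfs_vms: 0
   rbd_mirror_vms: 0
   client_vms: 0
   mgr_vms: 0
   public_subnet: 192.168.13
   cluster_subnet: 192.168.14
   memory: 512
   disks: [ '/dev/sda', '/dev/sdb' ]
   vagrant_box: ceph/centos7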
.. _vagrant_file:
``Vagrantfile``
---------------
The ``Vagrantfile`` should not need to change, and it is symlinked back to the
``Vagrantfile`` that exists in the root of the project. It is linked in this
way so that a vagrant environment can be isolated to the given scenario.
.. _hosts_file:
``hosts``
---------
The ``hosts`` file should contain the hosts needed for the scenario. This might
seem a bit repetitive since machines are already defined in
:ref:`vagrant_variables` but it allows granular changes to hosts (for example
defining different public_network values between monitors) which can help catch issues in
``ceph-ansible`` configuration. For example:
.. code-block:: ini
[mons]
mon0 public_network=192.168.1.0/24
mon1 public_network=192.168.2.0/24
mon2 public_network=192.168.3.0/24
.. _group_vars:
``group_vars``
--------------
This directory holds any configuration change that will affect ``ceph-ansible``
deployments in the same way as if ansible was executed from the root of the
project.
The file that will always need to be defined is ``all``, where (again) certain
values like ``public_network`` and ``cluster_network`` will need to be defined
along with any customizations that ``ceph-ansible`` supports.
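A minimal sketch of the ``group_vars/all`` file for a scenario (the subnet values reuse the examples above and are assumptions):
.. code-block:: yaml

   public_network: "192.168.13.0/24"
   cluster_network: "192.168.14.0/24"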
.. _scenario_wiring:
Scenario Wiring
---------------
Scenarios are just meant to provide the Ceph environment for testing, but they
do need to be defined in the ``tox.ini`` so that they are available to the test
framework. To see a list of available scenarios, the following command (run
from the root of the project) will list them, shortened for brevity:
.. code-block:: console
$ tox -l
...
luminous-ansible2.4-centos7_cluster
...
These scenarios are made from different variables; in the above command there
are 3:
* ``luminous``: the Ceph version to test
* ``ansible2.4``: the Ansible version to install
* ``centos7_cluster``: the name of the scenario
The last one is important in the *wiring up* of the scenario. It is a variable
that will define in what path the scenario lives. For example, the
``changedir`` section for ``centos7_cluster`` that looks like:
.. code-block:: ini
centos7_cluster: {toxinidir}/tests/functional/centos/7/cluster
The actual tests are written for specific daemon types, for all daemon types,
and for specific use cases (e.g. journal collocation), those have their own
conventions as well which are explained in detail in :ref:`test_conventions`
and :ref:`test_files`.
As long as a test scenario defines OSDs and MONs, the OSD tests and MON tests
will run.
.. _scenario_conventions:
Conventions
-----------
.. _scenario_environment_configuration:
Environment configuration
-------------------------
.. _scenario_ansible_configuration:
Ansible configuration
---------------------

View File

@ -0,0 +1,99 @@
.. _tests:
Tests
=====
Actual tests are written in Python methods that accept optional fixtures. These
fixtures come with interesting attributes to help with remote assertions.
As described in :ref:`test_conventions`, tests need to go into
``tests/functional/tests/``. These are collected and *mapped* to a distinct
node type, or *mapped* to run on all nodes.
Simple Python asserts are used (these tests do not need to follow the Python
``unittest.TestCase`` base class) that make it easier to reason about failures
and errors.
The test run is handled by ``py.test`` along with :ref:`testinfra` for handling
remote execution.
.. _test_files:
Test Files
----------
.. _test_fixtures:
Test Fixtures
=============
Test fixtures are a powerful feature of ``py.test`` and most tests depend on
this for making assertions about remote nodes. To request them in a test
method, all that is needed is to require it as an argument.
Fixtures are detected by name, so as long as the argument being used has the
same name, the fixture will be passed in (see `pytest fixtures`_ for more
in-depth examples). The code that follows shows a test method that will use the
``node`` fixture that contains useful information about a node in a ceph
cluster:
.. code-block:: python
def test_ceph_conf(self, node):
assert node['conf_path'] == "/etc/ceph/ceph.conf"
The test is naive (the configuration path might not exist remotely) but
explains how simple it is to "request" a fixture.
For remote execution, we can rely further on other fixtures (tests can have as
many fixtures as needed) like ``File``:
.. code-block:: python
def test_ceph_config_has_inital_members_line(self, node, File):
assert File(node["conf_path"]).contains("^mon initial members = .*$")
.. _node:
``node`` fixture
----------------
The ``node`` fixture contains a few useful pieces of information about the node
where the test is being executed, this is captured once, before tests run:
* ``address``: The IP for the ``eth1`` interface
* ``subnet``: The subnet that ``address`` belongs to
* ``vars``: all the Ansible vars set for the current run
* ``osd_ids``: a list of all the OSD IDs
* ``num_mons``: the total number of monitors for the current environment
* ``num_devices``: the number of devices for the current node
* ``num_osd_hosts``: the total number of OSD hosts
* ``total_osds``: total number of OSDs on the current node
* ``cluster_name``: the name of the Ceph cluster (which defaults to 'ceph')
* ``conf_path``: since the cluster name can change the file path for the Ceph
configuration, this gets set according to the cluster name.
* ``cluster_address``: the address used for cluster communication. All
environments are set up with 2 interfaces, 1 being used exclusively for the
cluster
* ``docker``: A boolean that identifies a Ceph Docker cluster
* ``osds``: A list of OSD IDs, unless it is a Docker cluster, where it gets the
name of the devices (e.g. ``sda1``)
Other Fixtures
--------------
There are a lot of other fixtures provided by :ref:`testinfra` as well as
``py.test``. The full list of ``testinfra`` fixtures are available in
`testinfra_fixtures`_
``py.test`` builtin fixtures can be listed with ``pytest -q --fixtures`` and
they are described in `pytest builtin fixtures`_
.. _pytest fixtures: https://docs.pytest.org/en/latest/fixture.html
.. _pytest builtin fixtures: https://docs.pytest.org/en/latest/builtin.html#builtin-fixtures-function-arguments
.. _testinfra_fixtures: https://testinfra.readthedocs.io/en/latest/modules.html#modules

View File

@ -0,0 +1,75 @@
.. _tox:
``tox``
=======
``tox`` is an automation project we use to run our testing scenarios. It gives us
the ability to create a dynamic matrix of many testing scenarios, isolated testing environments
and provides a single entry point to run all tests in an automated and repeatable fashion.
Documentation for tox can be found `here <https://tox.readthedocs.io/en/latest/>`_.
.. _tox_environment_variables:
Environment variables
---------------------
When running ``tox`` we've allowed for the usage of environment variables to tweak certain settings
of the playbook run using Ansible's ``--extra-vars``. It's helpful in Jenkins jobs or for manual test
runs of ``ceph-ansible``.
The following environment variables are available for use:
* ``CEPH_DOCKER_REGISTRY``: (default: ``quay.io``) This would configure the ``ceph-ansible`` variable ``ceph_docker_registry``.
* ``CEPH_DOCKER_IMAGE``: (default: ``ceph/daemon``) This would configure the ``ceph-ansible`` variable ``ceph_docker_image``.
* ``CEPH_DOCKER_IMAGE_TAG``: (default: ``latest``) This would configure the ``ceph-ansible`` variable ``ceph_docker_image_tag``.
* ``CEPH_DEV_BRANCH``: (default: ``main``) This would configure the ``ceph-ansible`` variable ``ceph_dev_branch`` which defines which branch we'd
like to install from shaman.ceph.com.
* ``CEPH_DEV_SHA1``: (default: ``latest``) This would configure the ``ceph-ansible`` variable ``ceph_dev_sha1`` which defines which sha1 we'd like
to install from shaman.ceph.com.
* ``UPDATE_CEPH_DEV_BRANCH``: (default: ``main``) This would configure the ``ceph-ansible`` variable ``ceph_dev_branch`` which defines which branch we'd
like to update to from shaman.ceph.com.
* ``UPDATE_CEPH_DEV_SHA1``: (default: ``latest``) This would configure the ``ceph-ansible`` variable ``ceph_dev_sha1`` which defines which sha1 we'd like
to update to from shaman.ceph.com.
.. _tox_sections:
Sections
--------
The ``tox.ini`` file has a number of top level sections defined by ``[ ]`` and subsections within those. For complete documentation
on all subsections inside of a tox section please refer to the tox documentation.
* ``tox`` : This section contains the ``envlist`` which is used to create our dynamic matrix. Refer to the `section here <http://tox.readthedocs.io/en/latest/config.html#generating-environments-conditional-settings>`_ for more information on how the ``envlist`` works.
* ``purge`` : This section contains commands that only run for scenarios that purge the cluster and redeploy. You'll see this section being reused in ``testenv``
with the following syntax: ``{[purge]commands}``
* ``update`` : This section contains commands that only run for scenarios that deploy a cluster and then upgrade it to another Ceph version.
* ``testenv`` : This is the main section of the ``tox.ini`` file and is run for every scenario. This section contains many *factors* that define conditional
settings depending on the scenarios defined in the ``envlist``. For example, the factor ``centos7_cluster`` in the ``changedir`` subsection of ``testenv`` sets
the directory that tox will change to when that factor is selected (see the snippet below). This is an important behavior that allows us to use the same ``tox.ini`` and reuse commands while
tweaking certain sections per testing scenario.
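A simplified sketch of that mechanism (the factor name and path are
illustrative, not the project's actual ``tox.ini``):

.. code-block:: ini

   [testenv]
   # the 'centos7_cluster:' prefix applies the value only to environments
   # whose name contains that factor
   changedir =
       centos7_cluster: {toxinidir}/tests/functional/centos7-cluster
   # the commands defined in the [purge] section run only for purge scenarios
   commands =
       purge: {[purge]commands}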
.. _tox_environments:
Modifying or Adding environments
--------------------------------
The tox environments are controlled by the ``envlist`` subsection of the ``[tox]`` section. Anything inside of ``{}`` is considered a *factor* and will be included
in the dynamic matrix that tox creates. Inside of ``{}`` you can include a comma-separated list of *factors*. Do not use a hyphen (``-``) as part
of a *factor* name, as hyphens are used by tox as the separator between different factor sets.
For example, if you wanted to add a new test *factor* for the next Ceph release, luminous, this is how you'd accomplish it. Currently, the first factor set in our ``envlist``
is used to define the Ceph release (``{jewel,kraken}-...``). To add luminous you'd change that to look like ``{luminous,kraken}-...``. The ``testenv`` section
has a subsection called ``setenv`` which allows you to provide environment variables to the tox environment, and we support an environment variable called ``CEPH_STABLE_RELEASE``. To ensure that all the new tests created by adding the luminous *factor* install the right release, you'd add this to that section: ``luminous: CEPH_STABLE_RELEASE=luminous``.
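Putting those two changes together, a minimal sketch would look like this (the
second factor set uses an illustrative scenario name):

.. code-block:: ini

   [tox]
   # the first factor set selects the Ceph release, the second the scenario
   envlist = {luminous,kraken}-{centos7_cluster}

   [testenv]
   setenv =
       luminous: CEPH_STABLE_RELEASE=luminous
       kraken: CEPH_STABLE_RELEASE=kraken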

10
docs/tox.ini Normal file
View File

@ -0,0 +1,10 @@
[tox]
envlist = docs
skipsdist = True
[testenv:docs]
basepython=python
changedir=source
deps=sphinx==1.7.9
commands=
sphinx-build -W -b html -d {envtmpdir}/doctrees . {envtmpdir}/html

4
dummy-ansible-hosts Normal file
View File

@ -0,0 +1,4 @@
# Dummy ansible host file
# Used for syntax check by Travis
# Before committing code please run: ansible-playbook --syntax-check site.yml -i dummy-ansible-hosts
localhost

78
generate_group_vars_sample.sh Executable file
View File

@ -0,0 +1,78 @@
#!/usr/bin/env bash
set -euo pipefail
#############
# VARIABLES #
#############
basedir=$(dirname "$0")
do_not_generate="(ceph-common|ceph-container-common|ceph-fetch-keys)$" # pipe-separated list of roles we don't want to generate a sample file for, MUST end with '$', e.g: 'foo$|bar$'
#############
# FUNCTIONS #
#############
populate_header () {
for i in $output; do
cat <<EOF > "$basedir"/group_vars/"$i"
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by $(basename "$0")
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
EOF
done
}
generate_group_vars_file () {
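# Append the role's defaults to each sample file with every variable line
# commented out, so the generated file documents the defaults without setting
# them. BSD (Darwin) and GNU (Linux) sed need slightly different expressions.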
for i in $output; do
if [ "$(uname)" == "Darwin" ]; then
sed '/^---/d; s/^\([A-Za-z[:space:]]\)/#\1/' \
"$defaults" >> "$basedir"/group_vars/"$i"
echo >> "$basedir"/group_vars/"$i"
elif [ "$(uname -s)" == "Linux" ]; then
sed '/^---/d; s/^\([A-Za-z[:space:]].\+\)/#\1/' \
"$defaults" >> "$basedir"/group_vars/"$i"
echo >> "$basedir"/group_vars/"$i"
else
echo "Unsupported platform"
exit 1
fi
done
}
########
# MAIN #
########
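# Map each ceph-* role to a group_vars sample file name, then generate the
# sample from the role's defaults unless the role matches $do_not_generate.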
for role in "$basedir"/roles/ceph-*; do
rolename=$(basename "$role")
if [[ $rolename == "ceph-defaults" ]]; then
output="all.yml.sample"
elif [[ $rolename == "ceph-fetch-keys" ]]; then
output="ceph-fetch-keys.yml.sample"
elif [[ $rolename == "ceph-rbd-mirror" ]]; then
output="rbdmirrors.yml.sample"
elif [[ $rolename == "ceph-rgw-loadbalancer" ]]; then
output="rgwloadbalancers.yml.sample"
else
output="${rolename:5}s.yml.sample"
fi
defaults="$role"/defaults/main.yml
if [[ ! -f $defaults ]]; then
continue
fi
if ! echo "$rolename" | grep -qE "$do_not_generate"; then
populate_header
generate_group_vars_file
fi
done

666
group_vars/all.yml Normal file
View File

@ -0,0 +1,666 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
######################################
# Releases name to number dictionary #
######################################
ceph_release_num:
dumpling: 0.67
emperor: 0.72
firefly: 0.80
giant: 0.87
hammer: 0.94
infernalis: 9
jewel: 10
kraken: 11
luminous: 12
mimic: 13
nautilus: 14
octopus: 15
pacific: 16
quincy: 17
reef: 18
squid: 19
dev: 99
# The 'cluster' variable determines the name of the cluster.
# Changing the default value to something else means that you will
# need to change all the command line calls as well, for example if
# your cluster name is 'foo':
# "ceph health" will become "ceph --cluster foo health"
#
# An easier way to handle this is to use the environment variable CEPH_ARGS
# So run: export CEPH_ARGS="--cluster foo"
# With that you will be able to run "ceph health" normally
cluster: ceph
# Inventory host group variables
mon_group_name: mons
osd_group_name: osds
#rgw_group_name: rgws
#mds_group_name: mdss
#nfs_group_name: nfss
#rbdmirror_group_name: rbdmirrors
#client_group_name: clients
mgr_group_name: mgrs
#rgwloadbalancer_group_name: rgwloadbalancers
#monitoring_group_name: monitoring
adopt_label_group_names:
- "{{ mon_group_name }}"
- "{{ osd_group_name }}"
# - "{{ rgw_group_name }}"
# - "{{ mds_group_name }}"
# - "{{ nfs_group_name }}"
# - "{{ rbdmirror_group_name }}"
# - "{{ client_group_name }}"
- "{{ mgr_group_name }}"
# - "{{ rgwloadbalancer_group_name }}"
# - "{{ monitoring_group_name }}"
# If configure_firewall is true, then ansible will try to configure the
# appropriate firewalling rules so that Ceph daemons can communicate
# with each other.
configure_firewall: false
# Open ports on corresponding nodes if firewall is installed on it
#ceph_mon_firewall_zone: public
#ceph_mgr_firewall_zone: public
#ceph_osd_firewall_zone: public
#ceph_rgw_firewall_zone: public
#ceph_mds_firewall_zone: public
#ceph_nfs_firewall_zone: public
#ceph_rbdmirror_firewall_zone: public
#ceph_dashboard_firewall_zone: public
#ceph_rgwloadbalancer_firewall_zone: public
# cephadm account for remote connections
cephadm_ssh_user: root
cephadm_ssh_priv_key_path: "/home/{{ cephadm_ssh_user }}/.ssh/id_rsa"
cephadm_ssh_pub_key_path: "{{ cephadm_ssh_priv_key_path }}.pub"
cephadm_mgmt_network: "{{ public_network }}"
############
# PACKAGES #
############
#debian_package_dependencies: []
#centos_package_dependencies:
# - epel-release
# - "{{ (ansible_facts['distribution_major_version'] is version('8', '>=')) | ternary('python3-libselinux', 'libselinux-python') }}"
#redhat_package_dependencies: []
#suse_package_dependencies: []
# Whether or not to install the ceph-test package.
ceph_test: false
# Enable the ntp service by default to avoid clock skew on ceph nodes
# Disable if an appropriate NTP client is already installed and configured
ntp_service_enabled: true
# Set type of NTP client daemon to use, valid entries are chronyd, ntpd or timesyncd
ntp_daemon_type: chronyd
# This variable determines if ceph packages can be updated. If False, the
# package resources will use "state=present". If True, they will use
# "state=latest".
#upgrade_ceph_packages: false
#ceph_use_distro_backports: false # DEBIAN ONLY
#ceph_directories_mode: "0755"
###########
# INSTALL #
###########
# ORIGIN SOURCE
#
# Choose between:
# - 'repository' means that you will get ceph installed through a new repository. Later below choose between 'community', 'dev' or 'obs'
# - 'distro' means that no separate repo file will be added
# you will get whatever version of Ceph is included in your Linux distro.
# - 'local' means that the ceph binaries will be copied over from the local machine
ceph_origin: repository
#valid_ceph_origins:
# - repository
# - distro
# - local
ceph_repository: community
#valid_ceph_repository:
# - community
# - dev
# - uca
# - custom
# - obs
# REPOSITORY: COMMUNITY VERSION
#
# Enabled when ceph_repository == 'community'
#
ceph_mirror: https://download.ceph.com
ceph_stable_key: https://download.ceph.com/keys/release.asc
ceph_stable_release: squid
ceph_stable_repo: "{{ ceph_mirror }}/debian-{{ ceph_stable_release }}"
#nfs_ganesha_stable: true # use stable repos for nfs-ganesha
#centos_release_nfs: centos-release-nfs-ganesha4
#nfs_ganesha_stable_deb_repo: http://ppa.launchpad.net/nfs-ganesha/nfs-ganesha-4/ubuntu
#nfs_ganesha_apt_keyserver: keyserver.ubuntu.com
#nfs_ganesha_apt_key_id: EA914D611053D07BD332E18010353E8834DC57CA
#libntirpc_stable_deb_repo: http://ppa.launchpad.net/nfs-ganesha/libntirpc-4/ubuntu
# Use the option below to specify your applicable package tree, eg. when using non-LTS Ubuntu versions
# # for a list of available Debian distributions, visit http://download.ceph.com/debian-{{ ceph_stable_release }}/dists/
# for more info read: https://github.com/ceph/ceph-ansible/issues/305
# ceph_stable_distro_source: "{{ ansible_facts['distribution_release'] }}"
# REPOSITORY: UBUNTU CLOUD ARCHIVE
#
# Enabled when ceph_repository == 'uca'
#
# This allows the install of Ceph from the Ubuntu Cloud Archive. The Ubuntu Cloud Archive
# usually has newer Ceph releases than the normal distro repository.
#
#
#ceph_stable_repo_uca: "http://ubuntu-cloud.archive.canonical.com/ubuntu"
#ceph_stable_openstack_release_uca: queens
#ceph_stable_release_uca: "{{ ansible_facts['distribution_release'] }}-updates/{{ ceph_stable_openstack_release_uca }}"
# REPOSITORY: openSUSE OBS
#
# Enabled when ceph_repository == 'obs'
#
# This allows the install of Ceph from the openSUSE OBS repository. The OBS repository
# usually has newer Ceph releases than the normal distro repository.
#
#
#ceph_obs_repo: "https://download.opensuse.org/repositories/filesystems:/ceph:/{{ ceph_stable_release }}/openSUSE_Leap_{{ ansible_facts['distribution_version'] }}/"
# REPOSITORY: DEV
#
# Enabled when ceph_repository == 'dev'
#
#ceph_dev_branch: main # development branch you would like to use e.g: main, wip-hack
#ceph_dev_sha1: latest # distinct sha1 to use, defaults to 'latest' (as in latest built)
#nfs_ganesha_dev: false # use development repos for nfs-ganesha
# Set this to choose the version of ceph dev libraries used in the nfs-ganesha packages from shaman
# flavors so far include: ceph_main, ceph_jewel, ceph_kraken, ceph_luminous
#nfs_ganesha_flavor: "ceph_main"
# REPOSITORY: CUSTOM
#
# Enabled when ceph_repository == 'custom'
#
# Use a custom repository to install ceph. For RPM, ceph_custom_repo should be
# a URL to the .repo file to be installed on the targets. For deb,
# ceph_custom_repo should be the URL to the repo base.
#
# ceph_custom_key: https://server.domain.com/ceph-custom-repo/key.asc
#ceph_custom_repo: https://server.domain.com/ceph-custom-repo
# ORIGIN: LOCAL CEPH INSTALLATION
#
# Enabled when ceph_repository == 'local'
#
# Path to DESTDIR of the ceph install
# ceph_installation_dir: "/path/to/ceph_installation/"
# Whether or not to use installer script rundep_installer.sh
# This script takes in rundep and installs the packages line by line onto the machine
# If this is set to false then it is assumed that the machine ceph is being copied onto will already have
# all runtime dependencies installed
# use_installer: false
# Root directory for ceph-ansible
# ansible_dir: "/path/to/ceph-ansible"
######################
# CEPH CONFIGURATION #
######################
## Ceph options
#
# Each cluster requires a unique, consistent filesystem ID. By
# default, the playbook generates one for you.
# If you want to customize how the fsid is
# generated, you may find it useful to disable fsid generation to
# avoid cluttering up your ansible repo. If you set `generate_fsid` to
# false, you *must* generate `fsid` in another way.
# ACTIVATE THE FSID VARIABLE FOR NON-VAGRANT DEPLOYMENT
#fsid: "{{ cluster_uuid.stdout }}"
generate_fsid: true
ceph_conf_key_directory: /etc/ceph
ceph_uid: "{{ '64045' if not containerized_deployment | bool and ansible_facts['os_family'] == 'Debian' else '167' }}"
# Permissions for keyring files in /etc/ceph
ceph_keyring_permissions: '0600'
#cephx: true
# Cluster configuration
ceph_cluster_conf:
global:
public_network: "{{ public_network | default(omit) }}"
cluster_network: "{{ cluster_network | default(omit) }}"
osd_pool_default_crush_rule: "{{ osd_pool_default_crush_rule }}"
# ms_bind_ipv6: "{{ (ip_version == 'ipv6') | string }}"
ms_bind_ipv4: "{{ (ip_version == 'ipv4') | string }}"
osd_crush_chooseleaf_type: "{{ '0' if common_single_host_mode | default(false) else omit }}"
## Client options
#
#rbd_cache: "true"
#rbd_cache_writethrough_until_flush: "true"
#rbd_concurrent_management_ops: 20
#rbd_client_directories: true # this will create rbd_client_log_path and rbd_client_admin_socket_path directories with proper permissions
# Permissions for the rbd_client_log_path and
# rbd_client_admin_socket_path. Depending on your use case for Ceph
# you may want to change these values. The default, which is used if
# any of the variables are unset or set to a false value (like `null`
# or `false`) is to automatically determine what is appropriate for
# the Ceph version with non-OpenStack workloads -- ceph:ceph and 0770
# for infernalis releases, and root:root and 1777 for pre-infernalis
# releases.
#
# For other use cases, including running Ceph with OpenStack, you'll
# want to set these differently:
#
# For OpenStack on RHEL, you'll want:
# rbd_client_directory_owner: "qemu"
# rbd_client_directory_group: "libvirtd" (or "libvirt", depending on your version of libvirt)
# rbd_client_directory_mode: "0755"
#
# For OpenStack on Ubuntu or Debian, set:
# rbd_client_directory_owner: "libvirt-qemu"
# rbd_client_directory_group: "kvm"
# rbd_client_directory_mode: "0755"
#
# If you set rbd_client_directory_mode, you must use a string (e.g.,
# 'rbd_client_directory_mode: "0755"', *not*
# 'rbd_client_directory_mode: 0755', or Ansible will complain: mode
# must be in octal or symbolic form
#rbd_client_directory_owner: ceph
#rbd_client_directory_group: ceph
#rbd_client_directory_mode: "0755"
#rbd_client_log_path: /var/log/ceph
#rbd_client_log_file: "{{ rbd_client_log_path }}/qemu-guest-$pid.log" # must be writable by QEMU and allowed by SELinux or AppArmor
#rbd_client_admin_socket_path: /var/run/ceph # must be writable by QEMU and allowed by SELinux or AppArmor
## Monitor options
# set to either ipv4 or ipv6, whichever your network is using
ip_version: ipv4
mon_host_v1:
enabled: true
suffix: ':6789'
mon_host_v2:
suffix: ':3300'
#enable_ceph_volume_debug: false
##########
# CEPHFS #
##########
# When pg_autoscale_mode is set to True, you must add the target_size_ratio key with a correct value
# `pg_num` and `pgp_num` keys will be ignored, even if specified.
# eg:
# cephfs_data_pool:
# name: "{{ cephfs_data if cephfs_data is defined else 'cephfs_data' }}"
# target_size_ratio: 0.2
#cephfs: cephfs # name of the ceph filesystem
#cephfs_data_pool:
# name: "{{ cephfs_data if cephfs_data is defined else 'cephfs_data' }}"
#cephfs_metadata_pool:
# name: "{{ cephfs_metadata if cephfs_metadata is defined else 'cephfs_metadata' }}"
#cephfs_pools:
# - "{{ cephfs_data_pool }}"
# - "{{ cephfs_metadata_pool }}"
## OSD options
#
#lvmetad_disabled: false
#is_hci: false
#hci_safety_factor: 0.2
#non_hci_safety_factor: 0.7
#safety_factor: "{{ hci_safety_factor if is_hci | bool else non_hci_safety_factor }}"
#osd_memory_target: 4294967296
#journal_size: 5120 # OSD journal size in MB
#block_db_size: -1 # block db size in bytes for the ceph-volume lvm batch. -1 means use the default of 'as big as possible'.
public_network: 192.168.1.0/24
cluster_network: "{{ public_network | regex_replace(' ', '') }}"
#osd_mkfs_type: xfs
#osd_mkfs_options_xfs: -f -i size=2048
#osd_mount_options_xfs: noatime,largeio,inode64,swalloc
osd_objectstore: bluestore
# Any device containing these patterns in their path will be excluded.
#osd_auto_discovery_exclude: "dm-*|loop*|md*|rbd*"
## MDS options
#
#mds_max_mds: 1
## Rados Gateway options
#
#radosgw_frontend_type: beast # For additional frontends see: https://docs.ceph.com/en/latest/radosgw/frontends/
#radosgw_frontend_port: 8080
# The server private key, public certificate and any other CA or intermediate certificates should be in one file, in PEM format.
#radosgw_frontend_ssl_certificate: ""
#radosgw_frontend_ssl_certificate_data: "" # certificate contents to be written to path defined by radosgw_frontend_ssl_certificate
#radosgw_frontend_options: ""
#radosgw_thread_pool_size: 512
# You must define either radosgw_interface or radosgw_address.
# These variables must be defined at least in all.yml and overridden if needed (inventory host file or group_vars/*.yml).
# Eg. If you want to specify for each radosgw node which address the radosgw will bind to you can set it in your **inventory host file** by using 'radosgw_address' variable.
# Preference will go to radosgw_address if both radosgw_address and radosgw_interface are defined.
#radosgw_interface: interface
#radosgw_address: x.x.x.x
#radosgw_address_block: subnet
#radosgw_keystone_ssl: false # activate this when using keystone PKI keys
#radosgw_num_instances: 1
#rgw_zone: default # This is used for rgw instance client names.
## Testing mode
# enable this mode _only_ when you have a single node
# if you don't want it keep the option commented
# common_single_host_mode: true
## Handlers - restarting daemons after a config change
# if for whatever reason the content of your ceph configuration changes
# ceph daemons will be restarted as well. At the moment, we cannot detect
# which config option changed so all the daemons will be restarted. Although
# this restart will be serialized for each node, in between a health check
# will be performed so we make sure we don't move to the next node until
# ceph is healthy again
# Obviously between the checks (for monitors to be in quorum and for osds' pgs
# to be clean) we have to wait. These retries and delays are configurable
# for both monitors and osds.
#
# Monitor handler checks
#handler_health_mon_check_retries: 10
#handler_health_mon_check_delay: 20
#
# OSD handler checks
#handler_health_osd_check_retries: 40
#handler_health_osd_check_delay: 30
#handler_health_osd_check: true
#
# MDS handler checks
#handler_health_mds_check_retries: 5
#handler_health_mds_check_delay: 10
#
# RGW handler checks
#handler_health_rgw_check_retries: 5
#handler_health_rgw_check_delay: 10
#handler_rgw_use_haproxy_maintenance: false
# NFS handler checks
#handler_health_nfs_check_retries: 5
#handler_health_nfs_check_delay: 10
# RBD MIRROR handler checks
#handler_health_rbd_mirror_check_retries: 5
#handler_health_rbd_mirror_check_delay: 10
# MGR handler checks
#handler_health_mgr_check_retries: 5
#handler_health_mgr_check_delay: 10
## health mon/osds check retries/delay:
#health_mon_check_retries: 20
#health_mon_check_delay: 10
#health_osd_check_retries: 20
#health_osd_check_delay: 10
##############
# RBD-MIRROR #
##############
#ceph_rbd_mirror_pool: "rbd"
###############
# NFS-GANESHA #
###############
#
# Access type options
#
# Enable NFS File access
# If set to true, then ganesha is set up to export the root of the
# Ceph filesystem, and ganesha's attribute and directory caching is disabled
# as much as possible since libcephfs clients also cache the same
# information.
#
# Set this to true to enable File access via NFS. Requires an MDS role.
#nfs_file_gw: false
# Set this to true to enable Object access via NFS. Requires an RGW role.
#nfs_obj_gw: "{{ False if groups.get(mon_group_name, []) | length == 0 else True }}"
###################
# CONFIG OVERRIDE #
###################
# Ceph configuration file override.
# This allows you to specify more configuration options
# using an INI style format.
#
# When configuring RGWs, make sure you use the form [client.rgw.*]
# instead of [client.radosgw.*].
# For more examples check the profiles directory of https://github.com/ceph/ceph-ansible.
#
# The following sections are supported: [global], [mon], [osd], [mds], [client]
#
# Example:
# ceph_conf_overrides:
# global:
# foo: 1234
# bar: 5678
# "client.rgw.{{ rgw_zone }}.{{ hostvars[groups.get(rgw_group_name)[0]]['ansible_facts']['hostname'] }}":
# rgw_zone: zone1
#
#ceph_conf_overrides: {}
#############
# OS TUNING #
#############
#disable_transparent_hugepage: "{{ false if osd_objectstore == 'bluestore' }}"
#os_tuning_params:
# - { name: fs.file-max, value: 26234859 }
# - { name: vm.zone_reclaim_mode, value: 0 }
# - { name: vm.swappiness, value: 10 }
# - { name: vm.min_free_kbytes, value: "{{ vm_min_free_kbytes }}" }
# For Debian & Red Hat/CentOS installs set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
# Set this to a byte value (e.g. 134217728)
# A value of 0 will leave the package default.
#ceph_tcmalloc_max_total_thread_cache: 134217728
##########
# DOCKER #
##########
#ceph_docker_image: "ceph/ceph"
#ceph_docker_image_tag: v19
#ceph_docker_registry: quay.io
#ceph_docker_registry_auth: false
# ceph_docker_registry_username:
# ceph_docker_registry_password:
# ceph_docker_http_proxy:
# ceph_docker_https_proxy:
#ceph_docker_no_proxy: "localhost,127.0.0.1"
## Client only docker image - defaults to {{ ceph_docker_image }}
#ceph_client_docker_image: "{{ ceph_docker_image }}"
#ceph_client_docker_image_tag: "{{ ceph_docker_image_tag }}"
#ceph_client_docker_registry: "{{ ceph_docker_registry }}"
containerized_deployment: false
#container_binary:
#timeout_command: "{{ 'timeout --foreground -s KILL ' ~ docker_pull_timeout if (docker_pull_timeout != '0') and (ceph_docker_dev_image is undefined or not ceph_docker_dev_image) else '' }}"
#ceph_common_container_params:
# envs:
# NODE_NAME: "{{ ansible_facts['hostname'] }}"
# CONTAINER_IMAGE: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
# TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES: "{{ ceph_tcmalloc_max_total_thread_cache }}"
# args:
# - --setuser=ceph
# - --setgroup=ceph
# - --default-log-to-file=false
# - --default-log-to-stderr=true
# - --default-log-stderr-prefix="debug "
# volumes:
# - /var/lib/ceph/crash:/var/lib/ceph/crash:z
# - /var/run/ceph:/var/run/ceph:z
# - /var/log/ceph:/var/log/ceph:z
# - /etc/ceph:/etc/ceph:z
# - /etc/localtime:/etc/localtime:ro
# this is only here for usage with the rolling_update.yml playbook
# do not ever change this here
#rolling_update: false
#####################
# Docker pull retry #
#####################
#docker_pull_retry: 3
#docker_pull_timeout: "300s"
#############
# DASHBOARD #
#############
dashboard_enabled: false
# Choose http or https
# For https, you should set dashboard.crt/key and grafana.crt/key
# If you define the dashboard_crt and dashboard_key variables, but leave them as '',
# then we will autogenerate a cert and keyfile
#dashboard_protocol: https
#dashboard_port: 8443
# set this variable to the network you want the dashboard to listen on (defaults to public_network)
#dashboard_network: "{{ public_network }}"
#dashboard_admin_user: admin
#dashboard_admin_user_ro: false
# This variable must be set with a strong custom password when dashboard_enabled is True
# dashboard_admin_password: p@ssw0rd
# We only need this for SSL (https) connections
#dashboard_crt: ''
#dashboard_key: ''
#dashboard_certificate_cn: ceph-dashboard
#dashboard_tls_external: false
#dashboard_grafana_api_no_ssl_verify: "{{ true if dashboard_protocol == 'https' and not grafana_crt and not grafana_key else false }}"
#dashboard_rgw_api_user_id: ceph-dashboard
#dashboard_rgw_api_admin_resource: ''
#dashboard_rgw_api_no_ssl_verify: false
#dashboard_frontend_vip: ''
#dashboard_disabled_features: []
#prometheus_frontend_vip: ''
#alertmanager_frontend_vip: ''
#node_exporter_container_image: "docker.io/prom/node-exporter:v0.17.0"
#node_exporter_port: 9100
#grafana_admin_user: admin
# This variable must be set with a strong custom password when dashboard_enabled is True
# grafana_admin_password: admin
# We only need this for SSL (https) connections
#grafana_crt: ''
#grafana_key: ''
# When using https, please fill with a hostname for which grafana_crt is valid.
#grafana_server_fqdn: ''
#grafana_container_image: "docker.io/grafana/grafana:6.7.4"
#grafana_container_cpu_period: 100000
#grafana_container_cpu_cores: 2
# container_memory is in GB
#grafana_container_memory: 4
#grafana_uid: 472
#grafana_datasource: Dashboard
#grafana_dashboards_path: "/etc/grafana/dashboards/ceph-dashboard"
#grafana_dashboard_version: main
#grafana_dashboard_files:
# - ceph-cluster.json
# - cephfs-overview.json
# - host-details.json
# - hosts-overview.json
# - osd-device-details.json
# - osds-overview.json
# - pool-detail.json
# - pool-overview.json
# - radosgw-detail.json
# - radosgw-overview.json
# - radosgw-sync-overview.json
# - rbd-details.json
# - rbd-overview.json
#grafana_plugins:
# - vonage-status-panel
# - grafana-piechart-panel
#grafana_allow_embedding: true
#grafana_port: 3000
#grafana_network: "{{ public_network }}"
#grafana_conf_overrides: {}
#prometheus_container_image: "docker.io/prom/prometheus:v2.7.2"
#prometheus_container_cpu_period: 100000
#prometheus_container_cpu_cores: 2
# container_memory is in GB
#prometheus_container_memory: 4
#prometheus_data_dir: /var/lib/prometheus
#prometheus_conf_dir: /etc/prometheus
#prometheus_user_id: '65534' # This is the UID used by the prom/prometheus container image
#prometheus_port: 9092
#prometheus_conf_overrides: {}
# Uncomment this variable if you need to customize the retention period for prometheus storage.
# set it to '30d' if you want to retain 30 days of data.
# prometheus_storage_tsdb_retention_time: 15d
#alertmanager_container_image: "docker.io/prom/alertmanager:v0.16.2"
#alertmanager_container_cpu_period: 100000
#alertmanager_container_cpu_cores: 2
# container_memory is in GB
#alertmanager_container_memory: 4
#alertmanager_data_dir: /var/lib/alertmanager
#alertmanager_conf_dir: /etc/alertmanager
#alertmanager_port: 9093
#alertmanager_cluster_port: 9094
#alertmanager_conf_overrides: {}
#alertmanager_dashboard_api_no_ssl_verify: "{{ true if dashboard_protocol == 'https' and not dashboard_crt and not dashboard_key else false }}"
#no_log_on_ceph_key_tasks: true
###############
# DEPRECATION #
###############
######################################################
# VARIABLES BELOW SHOULD NOT BE MODIFIED BY THE USER #
# *DO NOT* MODIFY THEM #
######################################################
#container_exec_cmd:
#docker: false
#ceph_volume_debug: "{{ enable_ceph_volume_debug | ternary(1, 0) }}"

View File

@ -0,0 +1,667 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
######################################
# Releases name to number dictionary #
######################################
#ceph_release_num:
# dumpling: 0.67
# emperor: 0.72
# firefly: 0.80
# giant: 0.87
# hammer: 0.94
# infernalis: 9
# jewel: 10
# kraken: 11
# luminous: 12
# mimic: 13
# nautilus: 14
# octopus: 15
# pacific: 16
# quincy: 17
# reef: 18
# squid: 19
# dev: 99
# The 'cluster' variable determines the name of the cluster.
# Changing the default value to something else means that you will
# need to change all the command line calls as well, for example if
# your cluster name is 'foo':
# "ceph health" will become "ceph --cluster foo health"
#
# An easier way to handle this is to use the environment variable CEPH_ARGS
# So run: export CEPH_ARGS="--cluster foo"
# With that you will be able to run "ceph health" normally
#cluster: ceph
# Inventory host group variables
#mon_group_name: mons
#osd_group_name: osds
#rgw_group_name: rgws
#mds_group_name: mdss
#nfs_group_name: nfss
#rbdmirror_group_name: rbdmirrors
#client_group_name: clients
#mgr_group_name: mgrs
#rgwloadbalancer_group_name: rgwloadbalancers
#monitoring_group_name: monitoring
#adopt_label_group_names:
# - "{{ mon_group_name }}"
# - "{{ osd_group_name }}"
# - "{{ rgw_group_name }}"
# - "{{ mds_group_name }}"
# - "{{ nfs_group_name }}"
# - "{{ rbdmirror_group_name }}"
# - "{{ client_group_name }}"
# - "{{ mgr_group_name }}"
# - "{{ rgwloadbalancer_group_name }}"
# - "{{ monitoring_group_name }}"
# If configure_firewall is true, then ansible will try to configure the
# appropriate firewalling rules so that Ceph daemons can communicate
# with each other.
#configure_firewall: true
# Open ports on corresponding nodes if firewall is installed on it
#ceph_mon_firewall_zone: public
#ceph_mgr_firewall_zone: public
#ceph_osd_firewall_zone: public
#ceph_rgw_firewall_zone: public
#ceph_mds_firewall_zone: public
#ceph_nfs_firewall_zone: public
#ceph_rbdmirror_firewall_zone: public
#ceph_dashboard_firewall_zone: public
#ceph_rgwloadbalancer_firewall_zone: public
# cephadm account for remote connections
#cephadm_ssh_user: root
#cephadm_ssh_priv_key_path: "/home/{{ cephadm_ssh_user }}/.ssh/id_rsa"
#cephadm_ssh_pub_key_path: "{{ cephadm_ssh_priv_key_path }}.pub"
#cephadm_mgmt_network: "{{ public_network }}"
############
# PACKAGES #
############
#debian_package_dependencies: []
#centos_package_dependencies:
# - epel-release
# - "{{ (ansible_facts['distribution_major_version'] is version('8', '>=')) | ternary('python3-libselinux', 'libselinux-python') }}"
#redhat_package_dependencies: []
#suse_package_dependencies: []
# Whether or not to install the ceph-test package.
#ceph_test: false
# Enable the ntp service by default to avoid clock skew on ceph nodes
# Disable if an appropriate NTP client is already installed and configured
#ntp_service_enabled: true
# Set type of NTP client daemon to use, valid entries are chronyd, ntpd or timesyncd
#ntp_daemon_type: chronyd
# This variable determines if ceph packages can be updated. If False, the
# package resources will use "state=present". If True, they will use
# "state=latest".
#upgrade_ceph_packages: false
#ceph_use_distro_backports: false # DEBIAN ONLY
#ceph_directories_mode: "0755"
###########
# INSTALL #
###########
# ORIGIN SOURCE
#
# Choose between:
# - 'repository' means that you will get ceph installed through a new repository. Later below choose between 'community', 'dev' or 'obs'
# - 'distro' means that no separate repo file will be added
# you will get whatever version of Ceph is included in your Linux distro.
# - 'local' means that the ceph binaries will be copied over from the local machine
#ceph_origin: dummy
#valid_ceph_origins:
# - repository
# - distro
# - local
#ceph_repository: dummy
#valid_ceph_repository:
# - community
# - dev
# - uca
# - custom
# - obs
# REPOSITORY: COMMUNITY VERSION
#
# Enabled when ceph_repository == 'community'
#
#ceph_mirror: https://download.ceph.com
#ceph_stable_key: https://download.ceph.com/keys/release.asc
#ceph_stable_release: squid
#ceph_stable_repo: "{{ ceph_mirror }}/debian-{{ ceph_stable_release }}"
#nfs_ganesha_stable: true # use stable repos for nfs-ganesha
#centos_release_nfs: centos-release-nfs-ganesha4
#nfs_ganesha_stable_deb_repo: http://ppa.launchpad.net/nfs-ganesha/nfs-ganesha-4/ubuntu
#nfs_ganesha_apt_keyserver: keyserver.ubuntu.com
#nfs_ganesha_apt_key_id: EA914D611053D07BD332E18010353E8834DC57CA
#libntirpc_stable_deb_repo: http://ppa.launchpad.net/nfs-ganesha/libntirpc-4/ubuntu
# Use the option below to specify your applicable package tree, eg. when using non-LTS Ubuntu versions
# # for a list of available Debian distributions, visit http://download.ceph.com/debian-{{ ceph_stable_release }}/dists/
# for more info read: https://github.com/ceph/ceph-ansible/issues/305
# ceph_stable_distro_source: "{{ ansible_facts['distribution_release'] }}"
# REPOSITORY: UBUNTU CLOUD ARCHIVE
#
# Enabled when ceph_repository == 'uca'
#
# This allows the install of Ceph from the Ubuntu Cloud Archive. The Ubuntu Cloud Archive
# usually has newer Ceph releases than the normal distro repository.
#
#
#ceph_stable_repo_uca: "http://ubuntu-cloud.archive.canonical.com/ubuntu"
#ceph_stable_openstack_release_uca: queens
#ceph_stable_release_uca: "{{ ansible_facts['distribution_release'] }}-updates/{{ ceph_stable_openstack_release_uca }}"
# REPOSITORY: openSUSE OBS
#
# Enabled when ceph_repository == 'obs'
#
# This allows the install of Ceph from the openSUSE OBS repository. The OBS repository
# usually has newer Ceph releases than the normal distro repository.
#
#
#ceph_obs_repo: "https://download.opensuse.org/repositories/filesystems:/ceph:/{{ ceph_stable_release }}/openSUSE_Leap_{{ ansible_facts['distribution_version'] }}/"
# REPOSITORY: DEV
#
# Enabled when ceph_repository == 'dev'
#
#ceph_dev_branch: main # development branch you would like to use e.g: main, wip-hack
#ceph_dev_sha1: latest # distinct sha1 to use, defaults to 'latest' (as in latest built)
#nfs_ganesha_dev: false # use development repos for nfs-ganesha
# Set this to choose the version of ceph dev libraries used in the nfs-ganesha packages from shaman
# flavors so far include: ceph_main, ceph_jewel, ceph_kraken, ceph_luminous
#nfs_ganesha_flavor: "ceph_main"
# REPOSITORY: CUSTOM
#
# Enabled when ceph_repository == 'custom'
#
# Use a custom repository to install ceph. For RPM, ceph_custom_repo should be
# a URL to the .repo file to be installed on the targets. For deb,
# ceph_custom_repo should be the URL to the repo base.
#
# ceph_custom_key: https://server.domain.com/ceph-custom-repo/key.asc
#ceph_custom_repo: https://server.domain.com/ceph-custom-repo
# ORIGIN: LOCAL CEPH INSTALLATION
#
# Enabled when ceph_repository == 'local'
#
# Path to DESTDIR of the ceph install
# ceph_installation_dir: "/path/to/ceph_installation/"
# Whether or not to use installer script rundep_installer.sh
# This script takes in rundep and installs the packages line by line onto the machine
# If this is set to false then it is assumed that the machine ceph is being copied onto will already have
# all runtime dependencies installed
# use_installer: false
# Root directory for ceph-ansible
# ansible_dir: "/path/to/ceph-ansible"
######################
# CEPH CONFIGURATION #
######################
## Ceph options
#
# Each cluster requires a unique, consistent filesystem ID. By
# default, the playbook generates one for you.
# If you want to customize how the fsid is
# generated, you may find it useful to disable fsid generation to
# avoid cluttering up your ansible repo. If you set `generate_fsid` to
# false, you *must* generate `fsid` in another way.
# ACTIVATE THE FSID VARIABLE FOR NON-VAGRANT DEPLOYMENT
#fsid: "{{ cluster_uuid.stdout }}"
#generate_fsid: true
#ceph_conf_key_directory: /etc/ceph
#ceph_uid: "{{ '64045' if not containerized_deployment | bool and ansible_facts['os_family'] == 'Debian' else '167' }}"
# Permissions for keyring files in /etc/ceph
#ceph_keyring_permissions: '0600'
#cephx: true
# Cluster configuration
#ceph_cluster_conf:
# global:
# public_network: "{{ public_network | default(omit) }}"
# cluster_network: "{{ cluster_network | default(omit) }}"
# osd_pool_default_crush_rule: "{{ osd_pool_default_crush_rule }}"
# ms_bind_ipv6: "{{ (ip_version == 'ipv6') | string }}"
# ms_bind_ipv4: "{{ (ip_version == 'ipv4') | string }}"
# osd_crush_chooseleaf_type: "{{ '0' if common_single_host_mode | default(false) else omit }}"
## Client options
#
#rbd_cache: "true"
#rbd_cache_writethrough_until_flush: "true"
#rbd_concurrent_management_ops: 20
#rbd_client_directories: true # this will create rbd_client_log_path and rbd_client_admin_socket_path directories with proper permissions
# Permissions for the rbd_client_log_path and
# rbd_client_admin_socket_path. Depending on your use case for Ceph
# you may want to change these values. The default, which is used if
# any of the variables are unset or set to a false value (like `null`
# or `false`) is to automatically determine what is appropriate for
# the Ceph version with non-OpenStack workloads -- ceph:ceph and 0770
# for infernalis releases, and root:root and 1777 for pre-infernalis
# releases.
#
# For other use cases, including running Ceph with OpenStack, you'll
# want to set these differently:
#
# For OpenStack on RHEL, you'll want:
# rbd_client_directory_owner: "qemu"
# rbd_client_directory_group: "libvirtd" (or "libvirt", depending on your version of libvirt)
# rbd_client_directory_mode: "0755"
#
# For OpenStack on Ubuntu or Debian, set:
# rbd_client_directory_owner: "libvirt-qemu"
# rbd_client_directory_group: "kvm"
# rbd_client_directory_mode: "0755"
#
# If you set rbd_client_directory_mode, you must use a string (e.g.,
# 'rbd_client_directory_mode: "0755"', *not*
# 'rbd_client_directory_mode: 0755', or Ansible will complain: mode
# must be in octal or symbolic form
#rbd_client_directory_owner: ceph
#rbd_client_directory_group: ceph
#rbd_client_directory_mode: "0755"
#rbd_client_log_path: /var/log/ceph
#rbd_client_log_file: "{{ rbd_client_log_path }}/qemu-guest-$pid.log" # must be writable by QEMU and allowed by SELinux or AppArmor
#rbd_client_admin_socket_path: /var/run/ceph # must be writable by QEMU and allowed by SELinux or AppArmor
## Monitor options
# set to either ipv4 or ipv6, whichever your network is using
#ip_version: ipv4
#mon_host_v1:
# enabled: true
# suffix: ':6789'
#mon_host_v2:
# suffix: ':3300'
#enable_ceph_volume_debug: false
##########
# CEPHFS #
##########
# When pg_autoscale_mode is set to True, you must add the target_size_ratio key with a correct value
# `pg_num` and `pgp_num` keys will be ignored, even if specified.
# eg:
# cephfs_data_pool:
# name: "{{ cephfs_data if cephfs_data is defined else 'cephfs_data' }}"
# target_size_ratio: 0.2
#cephfs: cephfs # name of the ceph filesystem
#cephfs_data_pool:
# name: "{{ cephfs_data if cephfs_data is defined else 'cephfs_data' }}"
#cephfs_metadata_pool:
# name: "{{ cephfs_metadata if cephfs_metadata is defined else 'cephfs_metadata' }}"
#cephfs_pools:
# - "{{ cephfs_data_pool }}"
# - "{{ cephfs_metadata_pool }}"
## OSD options
#
#lvmetad_disabled: false
#is_hci: false
#hci_safety_factor: 0.2
#non_hci_safety_factor: 0.7
#safety_factor: "{{ hci_safety_factor if is_hci | bool else non_hci_safety_factor }}"
#osd_memory_target: 4294967296
#journal_size: 5120 # OSD journal size in MB
#block_db_size: -1 # block db size in bytes for the ceph-volume lvm batch. -1 means use the default of 'as big as possible'.
#public_network: 0.0.0.0/0
#cluster_network: "{{ public_network | regex_replace(' ', '') }}"
#osd_mkfs_type: xfs
#osd_mkfs_options_xfs: -f -i size=2048
#osd_mount_options_xfs: noatime,largeio,inode64,swalloc
#osd_objectstore: bluestore
# Any device containing these patterns in their path will be excluded.
#osd_auto_discovery_exclude: "dm-*|loop*|md*|rbd*"
## MDS options
#
#mds_max_mds: 1
## Rados Gateway options
#
#radosgw_frontend_type: beast # For additional frontends see: https://docs.ceph.com/en/latest/radosgw/frontends/
#radosgw_frontend_port: 8080
# The server private key, public certificate and any other CA or intermediate certificates should be in one file, in PEM format.
#radosgw_frontend_ssl_certificate: ""
#radosgw_frontend_ssl_certificate_data: "" # certificate contents to be written to path defined by radosgw_frontend_ssl_certificate
#radosgw_frontend_options: ""
#radosgw_thread_pool_size: 512
# You must define either radosgw_interface or radosgw_address.
# These variables must be defined at least in all.yml and overridden if needed (inventory host file or group_vars/*.yml).
# Eg. If you want to specify for each radosgw node which address the radosgw will bind to you can set it in your **inventory host file** by using 'radosgw_address' variable.
# Preference will go to radosgw_address if both radosgw_address and radosgw_interface are defined.
#radosgw_interface: interface
#radosgw_address: x.x.x.x
#radosgw_address_block: subnet
#radosgw_keystone_ssl: false # activate this when using keystone PKI keys
#radosgw_num_instances: 1
#rgw_zone: default # This is used for rgw instance client names.
## Testing mode
# enable this mode _only_ when you have a single node
# if you don't want it keep the option commented
# common_single_host_mode: true
## Handlers - restarting daemons after a config change
# if for whatever reason the content of your ceph configuration changes
# ceph daemons will be restarted as well. At the moment, we cannot detect
# which config option changed so all the daemons will be restarted. Although
# this restart will be serialized for each node, in between a health check
# will be performed so we make sure we don't move to the next node until
# ceph is healthy again
# Obviously between the checks (for monitors to be in quorum and for osds' pgs
# to be clean) we have to wait. These retries and delays are configurable
# for both monitors and osds.
#
# Monitor handler checks
#handler_health_mon_check_retries: 10
#handler_health_mon_check_delay: 20
#
# OSD handler checks
#handler_health_osd_check_retries: 40
#handler_health_osd_check_delay: 30
#handler_health_osd_check: true
#
# MDS handler checks
#handler_health_mds_check_retries: 5
#handler_health_mds_check_delay: 10
#
# RGW handler checks
#handler_health_rgw_check_retries: 5
#handler_health_rgw_check_delay: 10
#handler_rgw_use_haproxy_maintenance: false
# NFS handler checks
#handler_health_nfs_check_retries: 5
#handler_health_nfs_check_delay: 10
# RBD MIRROR handler checks
#handler_health_rbd_mirror_check_retries: 5
#handler_health_rbd_mirror_check_delay: 10
# MGR handler checks
#handler_health_mgr_check_retries: 5
#handler_health_mgr_check_delay: 10
## health mon/osds check retries/delay:
#health_mon_check_retries: 20
#health_mon_check_delay: 10
#health_osd_check_retries: 20
#health_osd_check_delay: 10
##############
# RBD-MIRROR #
##############
#ceph_rbd_mirror_pool: "rbd"
###############
# NFS-GANESHA #
###############
#
# Access type options
#
# Enable NFS File access
# If set to true, then ganesha is set up to export the root of the
# Ceph filesystem, and ganesha's attribute and directory caching is disabled
# as much as possible since libcephfs clients also cache the same
# information.
#
# Set this to true to enable File access via NFS. Requires an MDS role.
#nfs_file_gw: false
# Set this to true to enable Object access via NFS. Requires an RGW role.
#nfs_obj_gw: "{{ False if groups.get(mon_group_name, []) | length == 0 else True }}"
###################
# CONFIG OVERRIDE #
###################
# Ceph configuration file override.
# This allows you to specify more configuration options
# using an INI style format.
#
# When configuring RGWs, make sure you use the form [client.rgw.*]
# instead of [client.radosgw.*].
# For more examples check the profiles directory of https://github.com/ceph/ceph-ansible.
#
# The following sections are supported: [global], [mon], [osd], [mds], [client]
#
# Example:
# ceph_conf_overrides:
# global:
# foo: 1234
# bar: 5678
# "client.rgw.{{ rgw_zone }}.{{ hostvars[groups.get(rgw_group_name)[0]]['ansible_facts']['hostname'] }}":
# rgw_zone: zone1
#
#ceph_conf_overrides: {}
#############
# OS TUNING #
#############
#disable_transparent_hugepage: "{{ false if osd_objectstore == 'bluestore' }}"
#os_tuning_params:
# - { name: fs.file-max, value: 26234859 }
# - { name: vm.zone_reclaim_mode, value: 0 }
# - { name: vm.swappiness, value: 10 }
# - { name: vm.min_free_kbytes, value: "{{ vm_min_free_kbytes }}" }
# For Debian & Red Hat/CentOS installs set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
# Set this to a byte value (e.g. 134217728)
# A value of 0 will leave the package default.
#ceph_tcmalloc_max_total_thread_cache: 134217728
##########
# DOCKER #
##########
#ceph_docker_image: "ceph/ceph"
#ceph_docker_image_tag: v19
#ceph_docker_registry: quay.io
#ceph_docker_registry_auth: false
# ceph_docker_registry_username:
# ceph_docker_registry_password:
# ceph_docker_http_proxy:
# ceph_docker_https_proxy:
#ceph_docker_no_proxy: "localhost,127.0.0.1"
## Client only docker image - defaults to {{ ceph_docker_image }}
#ceph_client_docker_image: "{{ ceph_docker_image }}"
#ceph_client_docker_image_tag: "{{ ceph_docker_image_tag }}"
#ceph_client_docker_registry: "{{ ceph_docker_registry }}"
#containerized_deployment: false
#container_binary:
#timeout_command: "{{ 'timeout --foreground -s KILL ' ~ docker_pull_timeout if (docker_pull_timeout != '0') and (ceph_docker_dev_image is undefined or not ceph_docker_dev_image) else '' }}"
#ceph_common_container_params:
# envs:
# NODE_NAME: "{{ ansible_facts['hostname'] }}"
# CONTAINER_IMAGE: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
# TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES: "{{ ceph_tcmalloc_max_total_thread_cache }}"
# args:
# - --setuser=ceph
# - --setgroup=ceph
# - --default-log-to-file=false
# - --default-log-to-stderr=true
# - --default-log-stderr-prefix="debug "
# volumes:
# - /var/lib/ceph/crash:/var/lib/ceph/crash:z
# - /var/run/ceph:/var/run/ceph:z
# - /var/log/ceph:/var/log/ceph:z
# - /etc/ceph:/etc/ceph:z
# - /etc/localtime:/etc/localtime:ro
# this is only here for usage with the rolling_update.yml playbook
# do not ever change this here
#rolling_update: false
#####################
# Docker pull retry #
#####################
#docker_pull_retry: 3
#docker_pull_timeout: "300s"
#############
# DASHBOARD #
#############
#dashboard_enabled: true
# Choose http or https
# For https, you should set dashboard.crt/key and grafana.crt/key
# If you define the dashboard_crt and dashboard_key variables, but leave them as '',
# then we will autogenerate a cert and keyfile
#dashboard_protocol: https
#dashboard_port: 8443
# set this variable to the network you want the dashboard to listen on (defaults to public_network)
#dashboard_network: "{{ public_network }}"
#dashboard_admin_user: admin
#dashboard_admin_user_ro: false
# This variable must be set with a strong custom password when dashboard_enabled is True
# dashboard_admin_password: p@ssw0rd
# We only need this for SSL (https) connections
#dashboard_crt: ''
#dashboard_key: ''
#dashboard_certificate_cn: ceph-dashboard
#dashboard_tls_external: false
#dashboard_grafana_api_no_ssl_verify: "{{ true if dashboard_protocol == 'https' and not grafana_crt and not grafana_key else false }}"
#dashboard_rgw_api_user_id: ceph-dashboard
#dashboard_rgw_api_admin_resource: ''
#dashboard_rgw_api_no_ssl_verify: false
#dashboard_frontend_vip: ''
#dashboard_disabled_features: []
#prometheus_frontend_vip: ''
#alertmanager_frontend_vip: ''
#node_exporter_container_image: "docker.io/prom/node-exporter:v0.17.0"
#node_exporter_port: 9100
#grafana_admin_user: admin
# This variable must be set with a strong custom password when dashboard_enabled is True
# grafana_admin_password: admin
# We only need this for SSL (https) connections
#grafana_crt: ''
#grafana_key: ''
# When using https, please fill with a hostname for which grafana_crt is valid.
#grafana_server_fqdn: ''
#grafana_container_image: "docker.io/grafana/grafana:6.7.4"
#grafana_container_cpu_period: 100000
#grafana_container_cpu_cores: 2
# container_memory is in GB
#grafana_container_memory: 4
#grafana_uid: 472
#grafana_datasource: Dashboard
#grafana_dashboards_path: "/etc/grafana/dashboards/ceph-dashboard"
#grafana_dashboard_version: main
#grafana_dashboard_files:
# - ceph-cluster.json
# - cephfs-overview.json
# - host-details.json
# - hosts-overview.json
# - osd-device-details.json
# - osds-overview.json
# - pool-detail.json
# - pool-overview.json
# - radosgw-detail.json
# - radosgw-overview.json
# - radosgw-sync-overview.json
# - rbd-details.json
# - rbd-overview.json
#grafana_plugins:
# - vonage-status-panel
# - grafana-piechart-panel
#grafana_allow_embedding: true
#grafana_port: 3000
#grafana_network: "{{ public_network }}"
#grafana_conf_overrides: {}
#prometheus_container_image: "docker.io/prom/prometheus:v2.7.2"
#prometheus_container_cpu_period: 100000
#prometheus_container_cpu_cores: 2
# container_memory is in GB
#prometheus_container_memory: 4
#prometheus_data_dir: /var/lib/prometheus
#prometheus_conf_dir: /etc/prometheus
#prometheus_user_id: '65534' # This is the UID used by the prom/prometheus container image
#prometheus_port: 9092
#prometheus_conf_overrides: {}
# Uncomment this variable if you need to customize the retention period for prometheus storage.
# set it to '30d' if you want to retain 30 days of data.
# prometheus_storage_tsdb_retention_time: 15d
#alertmanager_container_image: "docker.io/prom/alertmanager:v0.16.2"
#alertmanager_container_cpu_period: 100000
#alertmanager_container_cpu_cores: 2
# container_memory is in GB
#alertmanager_container_memory: 4
#alertmanager_data_dir: /var/lib/alertmanager
#alertmanager_conf_dir: /etc/alertmanager
#alertmanager_port: 9093
#alertmanager_cluster_port: 9094
#alertmanager_conf_overrides: {}
#alertmanager_dashboard_api_no_ssl_verify: "{{ true if dashboard_protocol == 'https' and not dashboard_crt and not dashboard_key else false }}"
#no_log_on_ceph_key_tasks: true
###############
# DEPRECATION #
###############
######################################################
# VARIABLES BELOW SHOULD NOT BE MODIFIED BY THE USER #
# *DO NOT* MODIFY THEM #
######################################################
#container_exec_cmd:
#docker: false
#ceph_volume_debug: "{{ enable_ceph_volume_debug | ternary(1, 0) }}"

View File

@ -0,0 +1,50 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
###########
# GENERAL #
###########
# Even though Client nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on Client nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
#user_config: false
# When pg_autoscale_mode is set to True, you must add the target_size_ratio key with a correct value
# `pg_num` and `pgp_num` keys will be ignored, even if specified.
# eg:
# test:
# name: "test"
# application: "rbd"
# target_size_ratio: 0.2
#test:
# name: "test"
# application: "rbd"
#test2:
# name: "test2"
# application: "rbd"
#pools:
# - "{{ test }}"
# - "{{ test2 }}"
# Generate a keyring using ceph-authtool CLI or python.
# Eg:
# $ ceph-authtool --gen-print-key
# or
# $ python2 -c "import os ; import struct ; import time; import base64 ; key = os.urandom(16) ; header = struct.pack('<hiih',1,int(time.time()),0,len(key)) ; print(base64.b64encode(header + key))"
#
# To use a particular secret, you have to add 'key' to the dict below, so something like:
# - { name: client.test, key: "AQAin8tUMICVFBAALRHNrV0Z4MXupRw4v9JQ6Q==" ...
#keys:
# - { name: client.test, caps: { mon: "profile rbd", osd: "allow class-read object_prefix rbd_children, profile rbd pool=test" }, mode: "{{ ceph_keyring_permissions }}" }
# - { name: client.test2, caps: { mon: "profile rbd", osd: "allow class-read object_prefix rbd_children, profile rbd pool=test2" }, mode: "{{ ceph_keyring_permissions }}" }

View File

@ -0,0 +1,33 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
###########
# GENERAL #
###########
#ceph_exporter_addr: "0.0.0.0"
#ceph_exporter_port: 9926
#ceph_exporter_stats_period: 5 # seconds
#ceph_exporter_prio_limit: 5
##########
# DOCKER #
##########
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_exporter_container_params:
# args:
# - -f
# - -n=client.ceph-exporter
# - --sock-dir=/var/run/ceph
# - --addrs={{ ceph_exporter_addr }}
# - --port={{ ceph_exporter_port }}
# - --stats-period={{ ceph_exporter_stats_period }}
# - --prio-limit={{ ceph_exporter_prio_limit }}

View File

@ -0,0 +1,52 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
# Even though MDS nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on MDS nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
##########
# DOCKER #
##########
# Resource limitation
# For the whole list of limits you can apply see: docs.docker.com/engine/admin/resource_constraints
# Default values are based from: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_mds_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_mds_docker_cpu_limit: 4
#ceph_config_keys: [] # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_mds_container_params:
# volumes:
# - /var/lib/ceph/bootstrap-mds:/var/lib/ceph/bootstrap-mds:z
# - /var/lib/ceph/mds/{{ cluster }}-{{ ansible_facts['hostname'] }}:/var/lib/ceph/mds/{{ cluster }}-{{ ansible_facts['hostname'] }}:z
# args:
# - -f
# - -i={{ ansible_facts['hostname'] }}
###########
# SYSTEMD #
###########
# ceph_mds_systemd_overrides will override the systemd settings
# for the ceph-mds services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_mds_systemd_overrides:
# Service:
# PrivateDevices: false

65
group_vars/mgrs.yml Normal file
View File

@ -0,0 +1,65 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
##########
# GLOBAL #
##########
# Even though MGR nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on MGR nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
copy_admin_key: false
mgr_secret: 'mgr_secret'
###########
# MODULES #
###########
# Ceph mgr modules to enable, to view the list of available modules see: http://docs.ceph.com/docs/CEPH_VERSION/mgr/
# and replace CEPH_VERSION with your current Ceph version, e.g. 'mimic'
#ceph_mgr_modules: []
############
# PACKAGES #
############
# Ceph mgr packages to install, ceph-mgr + extra module packages.
ceph_mgr_packages:
- ceph-mgr
##########
# DOCKER #
##########
# Resource limitation
# For the whole list of limits you can apply see: docs.docker.com/engine/admin/resource_constraints
# Default values are based from: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_mgr_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_mgr_docker_cpu_limit: 1
#ceph_config_keys: [] # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_mgr_container_params:
# volumes:
# - /var/lib/ceph/mgr:/var/lib/ceph/mgr:z,rshared
# - /var/lib/ceph/bootstrap-mgr:/var/lib/ceph/bootstrap-mgr:z
# args:
# - -f
# - -i={{ ansible_facts['hostname'] }}
###########
# SYSTEMD #
###########
# ceph_mgr_systemd_overrides will override the systemd settings
# for the ceph-mgr services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_mgr_systemd_overrides:
# Service:
# PrivateDevices: false

View File

@ -0,0 +1,66 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file generated by generate_group_vars_sample.sh
# Dummy variable to avoid error because ansible does not recognize the
# file as a good configuration file when no variable in it.
dummy:
##########
# GLOBAL #
##########
# Even though MGR nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on MGR nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
#mgr_secret: 'mgr_secret'
###########
# MODULES #
###########
# Ceph mgr modules to enable; to view the list of available modules, see: http://docs.ceph.com/docs/CEPH_VERSION/mgr/
# and replace CEPH_VERSION with your current Ceph version, e.g. 'mimic'
#ceph_mgr_modules: []
############
# PACKAGES #
############
# Ceph mgr packages to install, ceph-mgr + extra module packages.
#ceph_mgr_packages:
# - ceph-mgr
##########
# DOCKER #
##########
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_mgr_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_mgr_docker_cpu_limit: 1
#ceph_config_keys: [] # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_mgr_container_params:
# volumes:
# - /var/lib/ceph/mgr:/var/lib/ceph/mgr:z,rshared
# - /var/lib/ceph/bootstrap-mgr:/var/lib/ceph/bootstrap-mgr:z
# args:
# - -f
# - -i={{ ansible_facts['hostname'] }}
###########
# SYSTEMD #
###########
# ceph_mgr_systemd_overrides will override the systemd settings
# for the ceph-mgr services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_mgr_systemd_overrides:
# Service:
# PrivateDevices: false

77
group_vars/mons.yml Normal file
View File

@ -0,0 +1,77 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
mon_group_name: mons
# ACTIVATE BOTH FSID AND MONITOR_SECRET VARIABLES FOR NON-VAGRANT DEPLOYMENT
monitor_secret: "{{ monitor_keyring.stdout }}"
admin_secret: 'admin_secret'
# Secure your cluster
# This will set the following flags on all the pools:
# * nosizechange
# * nopgchange
# * nodelete
#secure_cluster: false
#secure_cluster_flags:
# - nopgchange
# - nodelete
# - nosizechange
client_admin_ceph_authtool_cap:
mon: allow *
osd: allow *
mds: allow *
mgr: allow *
##########
# DOCKER #
##########
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_mon_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_mon_docker_cpu_limit: 1
#ceph_mon_container_listen_port: 3300
# Use this variable to modify the configuration to run your mon container.
#mon_docker_privileged: false
#mon_docker_net_host: true
#ceph_config_keys: [] # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_mon_container_params:
# volumes:
# - /var/lib/ceph/mon:/var/lib/ceph/mon:z,rshared
# args:
# - -f
# - --default-mon-cluster-log-to-file=false
# - --default-mon-cluster-log-to-stderr=true
# - -i={{ monitor_name }}
# - --mon-data=/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}
# - --public-addr={{ _monitor_addresses[inventory_hostname] }}
# - --mon-initial-members={{ groups[mon_group_name] | map('extract', hostvars, 'ansible_facts') | map(attribute='hostname') | join(',') }}
###########
# SYSTEMD #
###########
# ceph_mon_systemd_overrides will override the systemd settings
# for the ceph-mon services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_mon_systemd_overrides:
# Service:
# PrivateDevices: false

View File

@ -0,0 +1,78 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
#mon_group_name: mons
# ACTIVATE BOTH FSID AND MONITOR_SECRET VARIABLES FOR NON-VAGRANT DEPLOYMENT
#monitor_secret: "{{ monitor_keyring.stdout }}"
#admin_secret: 'admin_secret'
# Secure your cluster
# This will set the following flags on all the pools:
# * nosizechange
# * nopgchange
# * nodelete
#secure_cluster: false
#secure_cluster_flags:
# - nopgchange
# - nodelete
# - nosizechange
#client_admin_ceph_authtool_cap:
# mon: allow *
# osd: allow *
# mds: allow *
# mgr: allow *
##########
# DOCKER #
##########
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_mon_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_mon_docker_cpu_limit: 1
#ceph_mon_container_listen_port: 3300
# Use this variable to modify the configuration to run your mon container.
#mon_docker_privileged: false
#mon_docker_net_host: true
#ceph_config_keys: [] # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_mon_container_params:
# volumes:
# - /var/lib/ceph/mon:/var/lib/ceph/mon:z,rshared
# args:
# - -f
# - --default-mon-cluster-log-to-file=false
# - --default-mon-cluster-log-to-stderr=true
# - -i={{ monitor_name }}
# - --mon-data=/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}
# - --public-addr={{ _monitor_addresses[inventory_hostname] }}
# - --mon-initial-members={{ groups[mon_group_name] | map('extract', hostvars, 'ansible_facts') | map(attribute='hostname') | join(',') }}
###########
# SYSTEMD #
###########
# ceph_mon_systemd_overrides will override the systemd settings
# for the ceph-mon services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_mon_systemd_overrides:
# Service:
# PrivateDevices: false

131
group_vars/nfss.yml.sample Normal file
View File

@ -0,0 +1,131 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
# Even though NFS nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on NFS nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
# Whether the docker container or systemd service should be enabled
# and started; it's useful to set this to false if the nfs-ganesha
# service is managed by pacemaker
#ceph_nfs_enable_service: true
# The ceph-nfs systemd service uses ansible's hostname as an instance id,
# so the service name is ceph-nfs@{{ ansible_facts['hostname'] }}. This is not
# ideal when ceph-nfs is managed by pacemaker across multiple hosts - in
# such a case it's better to have a constant instance id instead, which
# can be set via 'ceph_nfs_service_suffix'
# ceph_nfs_service_suffix: "{{ ansible_facts['hostname'] }}"
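# A minimal sketch for a pacemaker-managed active/passive setup (the suffix
# value is purely illustrative); every host then runs the same ceph-nfs@nfs unit:
# ceph_nfs_service_suffix: "nfs"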
######################
# NFS Ganesha Config #
######################
#ceph_nfs_log_file: "/var/log/ganesha/ganesha.log"
#ceph_nfs_dynamic_exports: false
# If set to true, then rados is used to store ganesha exports
# and client session information; this is useful if you
# run multiple nfs-ganesha servers in active/passive mode and
# want to do failover
#ceph_nfs_rados_backend: false
# Name of the rados object used to store a list of the export rados
# object URLs
#ceph_nfs_rados_export_index: "ganesha-export-index"
# Address ganesha service should listen on, by default ganesha listens on all
# addresses. (Note: ganesha ignores this parameter in the current version due to
# this bug: https://github.com/nfs-ganesha/nfs-ganesha/issues/217)
# ceph_nfs_bind_addr: 0.0.0.0
# If set to true, then ganesha's attribute and directory caching is disabled
# as much as possible. Currently, ganesha caches by default.
# When using ganesha as CephFS's gateway, it is recommended to turn off
# ganesha's caching as the libcephfs clients also cache the same information.
# Note: Irrespective of this option's setting, ganesha's caching is disabled
# when setting 'nfs_file_gw' option as true.
#ceph_nfs_disable_caching: false
# This is the file ganesha will use to control NFSv4 ID mapping
#ceph_nfs_idmap_conf: "/etc/ganesha/idmap.conf"
# idmap configuration file override.
# This allows you to specify more configuration options
# using an INI style format.
# Example:
# idmap_conf_overrides:
# General:
# Domain: foo.domain.net
#idmap_conf_overrides: {}
####################
# FSAL Ceph Config #
####################
#ceph_nfs_ceph_export_id: 20133
#ceph_nfs_ceph_pseudo_path: "/cephfile"
#ceph_nfs_ceph_protocols: "3,4"
#ceph_nfs_ceph_access_type: "RW"
#ceph_nfs_ceph_user: "admin"
#ceph_nfs_ceph_squash: "Root_Squash"
#ceph_nfs_ceph_sectype: "sys,krb5,krb5i,krb5p"
###################
# FSAL RGW Config #
###################
#ceph_nfs_rgw_export_id: 20134
#ceph_nfs_rgw_pseudo_path: "/cephobject"
#ceph_nfs_rgw_protocols: "3,4"
#ceph_nfs_rgw_access_type: "RW"
#ceph_nfs_rgw_user: "cephnfs"
#ceph_nfs_rgw_squash: "Root_Squash"
#ceph_nfs_rgw_sectype: "sys,krb5,krb5i,krb5p"
# Note: keys are optional and can be generated, but not on containerized
# deployments, where they must be configured.
# ceph_nfs_rgw_access_key: "QFAMEDSJP5DEKJO0DDXY"
# ceph_nfs_rgw_secret_key: "iaSFLDVvDdQt6lkNzHyW4fPLZugBAI1g17LO0+87[MAC[M#C"
#rgw_client_name: client.rgw.{{ ansible_facts['hostname'] }}
###################
# CONFIG OVERRIDE #
###################
# Ganesha configuration file override.
# These multiline strings will be appended to the contents of the blocks in ganesha.conf and
# must be in the correct ganesha.conf format seen here:
# https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ganesha.conf.example
#
# Example:
# CACHEINODE {
# # Entries_HWMark = 100000;
# }
#
# ganesha_core_param_overrides:
# ganesha_ceph_export_overrides:
# ganesha_rgw_export_overrides:
# ganesha_rgw_section_overrides:
# ganesha_log_overrides:
# ganesha_conf_overrides: |
# CACHEINODE {
# # Entries_HWMark = 100000;
# }
##########
# DOCKER #
##########
#ceph_docker_image: "ceph/daemon"
#ceph_docker_image_tag: latest
#ceph_nfs_docker_extra_env:
#ceph_config_keys: [] # DON'T TOUCH ME

225
group_vars/osds.yml Normal file
View File

@ -0,0 +1,225 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
###########
# GENERAL #
###########
# Even though OSD nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on OSD nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
copy_admin_key: false
##############
# CEPH OPTIONS
##############
# Devices to be used as OSDs
# You can pre-provision disks that are not present yet.
# Ansible will just skip them. Newly added disks will be
# automatically configured during the next run.
#
# Declare devices to be used as OSDs
# All scenarios (except the 3rd) inherit from the following device declaration
# Note: This scenario uses the ceph-volume lvm batch method to provision OSDs
devices:
- /dev/sdb
# - /dev/sdd
# - /dev/sde
#devices: []
# Declare devices to be used as block.db devices
# dedicated_devices:
# - /dev/sdx
# - /dev/sdy
#dedicated_devices: []
# Declare devices to be used as block.wal devices
# bluestore_wal_devices:
# - /dev/nvme0n1
# - /dev/nvme0n2
#bluestore_wal_devices: []
# 'osd_auto_discovery' mode prevents you from filling out the 'devices' variable above.
# Device discovery is based on the Ansible fact 'ansible_facts["devices"]'
# which reports all the devices on a system. If chosen, all the disks
# found will be passed to ceph-volume lvm batch. You should not be worried about using
# this option, since ceph-volume has a built-in check which looks for empty devices.
# Thus devices with existing partition tables will not be used.
#
#osd_auto_discovery: false
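# A minimal sketch relying on auto discovery instead of an explicit list
# (in that case, leave the 'devices' variable above unset/empty):
# osd_auto_discovery: true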
# Encrypt your OSD device using dmcrypt
# If set to true, the data will be encrypted no matter which osd_objectstore you use
#dmcrypt: false
# Use ceph-volume to create OSDs from logical volumes.
# lvm_volumes is a list of dictionaries.
#
# Filestore: Each dictionary must contain a data, journal and vg_name key. Any
# logical volume or volume group used must be a name and not a path. data
# can be a logical volume, device or partition. journal can be either an lv or a partition.
# You cannot use the same journal for many data lvs.
# data_vg must be the volume group name of the data lv, only applicable when data is an lv.
# journal_vg is optional and must be the volume group name of the journal lv, if applicable.
# For example:
# lvm_volumes:
# - data: data-lv1
# data_vg: vg1
# journal: journal-lv1
# journal_vg: vg2
# crush_device_class: foo
# - data: data-lv2
# journal: /dev/sda1
# data_vg: vg1
# - data: data-lv3
# journal: /dev/sdb1
# data_vg: vg2
# - data: /dev/sda
# journal: /dev/sdb1
# - data: /dev/sda1
# journal: /dev/sdb1
#
# Bluestore: Each dictionary must contain at least data. When defining wal or
# db, it must have both the lv name and the vg name (db and wal are not required).
# This allows for four combinations: just data, data and wal, data and wal and
# db, data and db.
# For example:
# lvm_volumes:
# - data: data-lv1
# data_vg: vg1
# wal: wal-lv1
# wal_vg: vg1
# crush_device_class: foo
# - data: data-lv2
# db: db-lv2
# db_vg: vg2
# - data: data-lv3
# wal: wal-lv1
# wal_vg: vg3
# db: db-lv3
# db_vg: vg3
# - data: data-lv4
# data_vg: vg4
# - data: /dev/sda
# - data: /dev/sdb1
#lvm_volumes: []
#crush_device_class: ""
#osds_per_device: 1
###############
# CRUSH RULES #
###############
#crush_rule_config: false
#crush_rule_hdd:
# name: HDD
# root: default
# type: host
# class: hdd
# default: false
#crush_rule_ssd:
# name: SSD
# root: default
# type: host
# class: ssd
# default: false
#crush_rules:
# - "{{ crush_rule_hdd }}"
# - "{{ crush_rule_ssd }}"
#ceph_ec_profiles: {}
# Caution: this will create crush roots and racks according to hostvars {{ osd_crush_location }}
# and will move hosts into them which might lead to significant data movement in the cluster!
#
# In order for the playbook to create CRUSH hierarchy, you have to setup your Ansible inventory file like so:
#
# [osds]
# ceph-osd-01 osd_crush_location="{ 'root': 'mon-roottt', 'rack': 'mon-rackkkk', 'pod': 'monpod', 'host': 'ceph-osd-01' }"
#
# Note that 'host' is mandatory and that you need to submit at least two bucket types (including the host)
#create_crush_tree: false
##########
# DOCKER #
##########
#ceph_config_keys: [] # DON'T TOUCH ME
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_osd_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_osd_docker_cpu_limit: 4
# The next two variables are undefined, and thus, unused by default.
# If `lscpu | grep NUMA` returned the following:
# NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16
# NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17
# then, the following would run the OSD on the first NUMA node only.
# ceph_osd_docker_cpuset_cpus: "0,2,4,6,8,10,12,14,16"
# ceph_osd_docker_cpuset_mems: "0"
# PREPARE DEVICE
#
# WARNING /!\ DMCRYPT scenario ONLY works with Docker version 1.12.5 and above
#
#ceph_osd_docker_devices: "{{ devices }}"
#ceph_osd_docker_prepare_env: -e OSD_JOURNAL_SIZE={{ journal_size }}
# ACTIVATE DEVICE
#
#ceph_osd_numactl_opts: ""
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_osd_container_params:
# volumes:
# - /dev:/dev
# - /var/lib/ceph/bootstrap-osd/ceph.keyring:/var/lib/ceph/bootstrap-osd/ceph.keyring:z
# - /var/lib/ceph/osd/{{ cluster }}-"${OSD_ID}":/var/lib/ceph/osd/{{ cluster }}-"${OSD_ID}":z
# - /var/run/udev/:/var/run/udev/
# - /run/lvm/:/run/lvm/
# envs:
# OSD_ID: ${OSD_ID}
# args:
# - -f
# - -i=${OSD_ID}
###########
# SYSTEMD #
###########
# ceph_osd_systemd_overrides will override the systemd settings
# for the ceph-osd services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_osd_systemd_overrides:
# Service:
# PrivateDevices: false
###########
# CHECK #
###########
#nb_retry_wait_osd_up: 60
#delay_wait_osd_up: 10

View File

@ -0,0 +1,227 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
###########
# GENERAL #
###########
# Even though OSD nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on OSD nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
##############
# CEPH OPTIONS
##############
# Devices to be used as OSDs
# You can pre-provision disks that are not present yet.
# Ansible will just skip them. Newly added disks will be
# automatically configured during the next run.
#
# Declare devices to be used as OSDs
# All scenarios (except the 3rd) inherit from the following device declaration
# Note: This scenario uses the ceph-volume lvm batch method to provision OSDs
# devices:
# - /dev/sdb
# - /dev/sdc
# - /dev/sdd
# - /dev/sde
#devices: []
# Declare devices to be used as block.db devices
# dedicated_devices:
# - /dev/sdx
# - /dev/sdy
#dedicated_devices: []
# Declare devices to be used as block.wal devices
# bluestore_wal_devices:
# - /dev/nvme0n1
# - /dev/nvme0n2
#bluestore_wal_devices: []
# 'osd_auto_discovery' mode prevents you from filling out the 'devices' variable above.
# Device discovery is based on the Ansible fact 'ansible_facts["devices"]'
# which reports all the devices on a system. If chosen, all the disks
# found will be passed to ceph-volume lvm batch. You should not be worried about using
# this option, since ceph-volume has a built-in check which looks for empty devices.
# Thus devices with existing partition tables will not be used.
#
#osd_auto_discovery: false
# Encrypt your OSD device using dmcrypt
# If set to true, the data will be encrypted no matter which osd_objectstore you use
#dmcrypt: false
# Use ceph-volume to create OSDs from logical volumes.
# lvm_volumes is a list of dictionaries.
#
# Filestore: Each dictionary must contain a data, journal and vg_name key. Any
# logical volume or volume group used must be a name and not a path. data
# can be a logical volume, device or partition. journal can be either an lv or a partition.
# You cannot use the same journal for many data lvs.
# data_vg must be the volume group name of the data lv, only applicable when data is an lv.
# journal_vg is optional and must be the volume group name of the journal lv, if applicable.
# For example:
# lvm_volumes:
# - data: data-lv1
# data_vg: vg1
# journal: journal-lv1
# journal_vg: vg2
# crush_device_class: foo
# - data: data-lv2
# journal: /dev/sda1
# data_vg: vg1
# - data: data-lv3
# journal: /dev/sdb1
# data_vg: vg2
# - data: /dev/sda
# journal: /dev/sdb1
# - data: /dev/sda1
# journal: /dev/sdb1
#
# Bluestore: Each dictionary must contain at least data. When defining wal or
# db, it must have both the lv name and the vg name (db and wal are not required).
# This allows for four combinations: just data, data and wal, data and wal and
# db, data and db.
# For example:
# lvm_volumes:
# - data: data-lv1
# data_vg: vg1
# wal: wal-lv1
# wal_vg: vg1
# crush_device_class: foo
# - data: data-lv2
# db: db-lv2
# db_vg: vg2
# - data: data-lv3
# wal: wal-lv1
# wal_vg: vg3
# db: db-lv3
# db_vg: vg3
# - data: data-lv4
# data_vg: vg4
# - data: /dev/sda
# - data: /dev/sdb1
#lvm_volumes: []
#crush_device_class: ""
#osds_per_device: 1
###############
# CRUSH RULES #
###############
#crush_rule_config: false
#crush_rule_hdd:
# name: HDD
# root: default
# type: host
# class: hdd
# default: false
#crush_rule_ssd:
# name: SSD
# root: default
# type: host
# class: ssd
# default: false
#crush_rules:
# - "{{ crush_rule_hdd }}"
# - "{{ crush_rule_ssd }}"
#ceph_ec_profiles: {}
# Caution: this will create crush roots and racks according to hostvars {{ osd_crush_location }}
# and will move hosts into them which might lead to significant data movement in the cluster!
#
# In order for the playbook to create CRUSH hierarchy, you have to setup your Ansible inventory file like so:
#
# [osds]
# ceph-osd-01 osd_crush_location="{ 'root': 'mon-roottt', 'rack': 'mon-rackkkk', 'pod': 'monpod', 'host': 'ceph-osd-01' }"
#
# Note that 'host' is mandatory and that you need to submit at least two bucket types (including the host)
#create_crush_tree: false
##########
# DOCKER #
##########
#ceph_config_keys: [] # DON'T TOUCH ME
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_osd_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_osd_docker_cpu_limit: 4
# The next two variables are undefined, and thus, unused by default.
# If `lscpu | grep NUMA` returned the following:
# NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16
# NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17
# then, the following would run the OSD on the first NUMA node only.
# ceph_osd_docker_cpuset_cpus: "0,2,4,6,8,10,12,14,16"
# ceph_osd_docker_cpuset_mems: "0"
# PREPARE DEVICE
#
# WARNING /!\ DMCRYPT scenario ONLY works with Docker version 1.12.5 and above
#
#ceph_osd_docker_devices: "{{ devices }}"
#ceph_osd_docker_prepare_env: -e OSD_JOURNAL_SIZE={{ journal_size }}
# ACTIVATE DEVICE
#
#ceph_osd_numactl_opts: ""
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_osd_container_params:
# volumes:
# - /dev:/dev
# - /var/lib/ceph/bootstrap-osd/ceph.keyring:/var/lib/ceph/bootstrap-osd/ceph.keyring:z
# - /var/lib/ceph/osd/{{ cluster }}-"${OSD_ID}":/var/lib/ceph/osd/{{ cluster }}-"${OSD_ID}":z
# - /var/run/udev/:/var/run/udev/
# - /run/lvm/:/run/lvm/
# envs:
# OSD_ID: ${OSD_ID}
# args:
# - -f
# - -i=${OSD_ID}
###########
# SYSTEMD #
###########
# ceph_osd_systemd_overrides will override the systemd settings
# for the ceph-osd services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_osd_systemd_overrides:
# Service:
# PrivateDevices: false
###########
# CHECK #
###########
#nb_retry_wait_osd_up: 60
#delay_wait_osd_up: 10

View File

@ -0,0 +1,55 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
#########
# SETUP #
#########
# Even though rbd-mirror nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on rbd-mirror nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory. Only
# valid for Luminous and later releases.
#copy_admin_key: false
#################
# CONFIGURATION #
#################
#ceph_rbd_mirror_local_user: client.rbd-mirror-peer
#ceph_rbd_mirror_configure: false
#ceph_rbd_mirror_mode: pool
#ceph_rbd_mirror_remote_cluster: remote
##########
# DOCKER #
##########
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
# These options can be passed using the 'ceph_rbd_mirror_docker_extra_env' variable.
#ceph_rbd_mirror_docker_memory_limit: "{{ ansible_facts['memtotal_mb'] }}m"
#ceph_rbd_mirror_docker_cpu_limit: 1
#ceph_rbd_mirror_docker_extra_env:
#ceph_config_keys: [] # DON'T TOUCH ME
###########
# SYSTEMD #
###########
# ceph_rbd_mirror_systemd_overrides will override the systemd settings
# for the ceph-rbd-mirror services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_rbd_mirror_systemd_overrides:
# Service:
# PrivateDevices: false

View File

@ -0,0 +1,35 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
#haproxy_frontend_port: 80
#haproxy_frontend_ssl_port: 443
#haproxy_frontend_ssl_certificate:
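# A hypothetical example (the path is an assumption; haproxy expects a single
# PEM file containing both the certificate and its private key):
# haproxy_frontend_ssl_certificate: /etc/ssl/private/rgw-frontend.pem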
#haproxy_ssl_dh_param: 4096
#haproxy_ssl_ciphers:
# - EECDH+AESGCM
# - EDH+AESGCM
#haproxy_ssl_options:
# - no-sslv3
# - no-tlsv10
# - no-tlsv11
# - no-tls-tickets
#
# virtual_ips:
# - 192.168.238.250
# - 192.168.238.251
#
# virtual_ip_netmask: 24
# virtual_ip_interface: ens33

106
group_vars/rgws.yml.sample Normal file
View File

@ -0,0 +1,106 @@
---
# Variables here are applicable to all host groups NOT roles
# This sample file was generated by generate_group_vars_sample.sh
# Dummy variable to avoid an error because Ansible does not recognize the
# file as a valid configuration file when it contains no variables.
dummy:
# You can override vars by using host or group vars
###########
# GENERAL #
###########
# Even though RGW nodes should not have the admin key
# at their disposal, some people might want to have it
# distributed on RGW nodes. Setting 'copy_admin_key' to 'true'
# will copy the admin key to the /etc/ceph/ directory
#copy_admin_key: false
##########
# TUNING #
##########
# Declaring rgw_create_pools will create pools with the given number of pgs,
# size, and type. The following are some important notes on this automatic
# pool creation:
# - The pools and associated pg_num's below are merely examples of pools that
# could be automatically created when rgws are deployed.
# - The default pg_num is 8 (from osd_pool_default_pg_num) for pools created
# if rgw_create_pools isn't declared and configured.
# - A pgcalc tool should be used to determine the optimal sizes for
# the rgw.buckets.data, rgw.buckets.index pools as well as any other
# pools declared in this dictionary.
# https://ceph.io/pgcalc is the upstream pgcalc tool
# https://access.redhat.com/labsinfo/cephpgc is a pgcalc tool offered by
# Red Hat if you are using RHCS.
# - The default value of {{ rgw_zone }} is 'default'.
# - The type must be set as either 'replicated' or 'ec' for
# each pool.
# - If a pool's type is 'ec', k and m values must be set via
# the ec_k, and ec_m variables.
# - The rule_name key can be used with a specific crush rule value (must exist).
# If the key doesn't exist, it falls back to the default replicated_rule.
# This only works for the replicated pool type, not erasure.
# rgw_create_pools:
# "{{ rgw_zone }}.rgw.buckets.data":
# pg_num: 64
# type: ec
# ec_profile: myecprofile
# ec_k: 5
# ec_m: 3
# "{{ rgw_zone }}.rgw.buckets.index":
# pg_num: 16
# size: 3
# type: replicated
# "{{ rgw_zone }}.rgw.meta":
# pg_num: 8
# size: 3
# type: replicated
# "{{ rgw_zone }}.rgw.log":
# pg_num: 8
# size: 3
# type: replicated
# "{{ rgw_zone }}.rgw.control":
# pg_num: 8
# size: 3
# type: replicated
# rule_name: foo
##########
# DOCKER #
##########
# Resource limitation
# For the full list of limits you can apply, see: docs.docker.com/engine/admin/resource_constraints
# Default values are based on: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations
#ceph_rgw_docker_memory_limit: "4096m"
#ceph_rgw_docker_cpu_limit: 8
# ceph_rgw_docker_cpuset_cpus: "0,2,4,6,8,10,12,14,16"
# ceph_rgw_docker_cpuset_mems: "0"
#ceph_config_keys: [] # DON'T TOUCH ME
#rgw_config_keys: "/" # DON'T TOUCH ME
# If you want to add parameters, you should retain the existing ones and include the new ones.
#ceph_rgw_container_params:
# volumes:
# - /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ rgw_zone }}.{{ ansible_facts['hostname'] }}.${INST_NAME}:/var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ rgw_zone }}.{{ ansible_facts['hostname'] }}.${INST_NAME}:z
# args:
# - -f
# - -n=client.rgw.{{ rgw_zone }}.{{ ansible_facts['hostname'] }}.${INST_NAME}
# - -k=/var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ rgw_zone }}.{{ ansible_facts['hostname'] }}.${INST_NAME}/keyring
###########
# SYSTEMD #
###########
# ceph_rgw_systemd_overrides will override the systemd settings
# for the ceph-rgw services.
# For example, to set "PrivateDevices=false" you can specify:
# ceph_rgw_systemd_overrides:
# Service:
# PrivateDevices: false

View File

@ -0,0 +1,7 @@
Infrastructure playbooks
========================
This directory contains a variety of playbooks that can be used independently of the Ceph roles we have.
They aim to perform infrastructure-related tasks that help with managing a Ceph cluster or performing certain operational tasks.
To use them, run `ansible-playbook infrastructure-playbooks/<playbook>`.
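For example, to back up the Ceph files of one monitor with the backup playbook shipped in this directory (the inventory path and node name below are placeholders), you could run `ansible-playbook -i hosts infrastructure-playbooks/backup-and-restore-ceph-files.yml -e backup_dir=/tmp/ceph-backup -e mode=backup -e target_node=mon01`.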

View File

@ -0,0 +1,128 @@
---
# This playbook is used to add a new MON to
# an existing cluster. It can run from any machine. Even if the fetch
# directory is not present, it will be created.
#
# Ensure that all monitors are present in the mons
# group in your inventory so that the ceph configuration file
# is created correctly for the new OSD(s).
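#
# A sketch of what the relevant part of the inventory might look like once the
# new monitor has been added (hostnames are placeholders):
#
# [mons]
# ceph-mon-01
# ceph-mon-02
# ceph-mon-03   # <- the monitor being added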
- name: Pre-requisites operations for adding new monitor(s)
hosts: mons
gather_facts: false
vars:
delegate_facts_host: true
become: true
pre_tasks:
- name: Import raw_install_python tasks
ansible.builtin.import_tasks: "{{ playbook_dir }}/../raw_install_python.yml"
- name: Gather facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
when: not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Gather and delegate facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
delegate_to: "{{ item }}"
delegate_facts: true
with_items: "{{ groups[mon_group_name] }}"
run_once: true
when: delegate_facts_host | bool
tasks:
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-validate role
ansible.builtin.import_role:
name: ceph-validate
- name: Import ceph-infra role
ansible.builtin.import_role:
name: ceph-infra
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-common role
ansible.builtin.import_role:
name: ceph-common
when: not containerized_deployment | bool
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
when: containerized_deployment | bool
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
when: containerized_deployment | bool
- name: Deploy Ceph monitors
hosts: mons
gather_facts: false
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-config role
ansible.builtin.import_role:
name: ceph-config
- name: Import ceph-mon role
ansible.builtin.import_role:
name: ceph-mon
- name: Import ceph-crash role
ansible.builtin.import_role:
name: ceph-crash
when: containerized_deployment | bool
- name: Import ceph-exporter role
ansible.builtin.import_role:
name: ceph-exporter
when: containerized_deployment | bool
- name: Update config file on OSD nodes
hosts: osds
gather_facts: true
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-config role
ansible.builtin.import_role:
name: ceph-config

View File

@ -0,0 +1,109 @@
---
# Copyright Red Hat
# SPDX-License-Identifier: Apache-2.0
#
# This playbook can help to back up some Ceph files and restore them later.
#
# Usage:
#
# ansible-playbook -i <inventory> backup-and-restore-ceph-files.yml -e backup_dir=<backup directory path> -e mode=<backup|restore> -e target_node=<inventory_name>
#
# Required run-time variables
# ---------------------------
# backup_dir : the path from which files will be read or to which they will be written.
# mode : tells the playbook whether to back up or restore files.
# target_node : the name of the node being processed; it must match the name set in the inventory.
#
# Examples
# --------
# ansible-playbook -i hosts, backup-and-restore-ceph-files.yml -e backup_dir=/usr/share/ceph-ansible/backup-ceph-files -e mode=backup -e target_node=mon01
# ansible-playbook -i hosts, backup-and-restore-ceph-files.yml -e backup_dir=/usr/share/ceph-ansible/backup-ceph-files -e mode=restore -e target_node=mon01
- name: Backup and restore Ceph files
hosts: localhost
become: true
gather_facts: true
tasks:
- name: Exit playbook, if user did not set the source node
ansible.builtin.fail:
msg: >
"You must pass the node name: -e target_node=<inventory_name>.
The name must match what is set in your inventory."
when:
- target_node is not defined
or target_node not in groups.get('all', [])
- name: Exit playbook, if user did not set the backup directory
ansible.builtin.fail:
msg: >
"you must pass the backup directory path: -e backup_dir=<backup directory path>"
when: backup_dir is not defined
- name: Exit playbook, if user did not set the playbook mode (backup|restore)
ansible.builtin.fail:
msg: >
"you must pass the mode: -e mode=<backup|restore>"
when:
- mode is not defined
or mode not in ['backup', 'restore']
- name: Gather facts on source node
ansible.builtin.setup:
delegate_to: "{{ target_node }}"
delegate_facts: true
- name: Backup mode
when: mode == 'backup'
block:
- name: Create a temp directory
ansible.builtin.tempfile:
state: directory
suffix: ansible-archive-ceph
register: tmp_dir
delegate_to: "{{ target_node }}"
- name: Archive files
community.general.archive:
path: "{{ item }}"
dest: "{{ tmp_dir.path }}/backup{{ item | replace('/', '-') }}.tar"
format: tar
mode: "0644"
delegate_to: "{{ target_node }}"
loop:
- /etc/ceph
- /var/lib/ceph
- name: Create backup directory
become: false
ansible.builtin.file:
path: "{{ backup_dir }}/{{ hostvars[target_node]['ansible_facts']['hostname'] }}"
state: directory
mode: "0755"
- name: Backup files
ansible.builtin.fetch:
src: "{{ tmp_dir.path }}/backup{{ item | replace('/', '-') }}.tar"
dest: "{{ backup_dir }}/{{ hostvars[target_node]['ansible_facts']['hostname'] }}/backup{{ item | replace('/', '-') }}.tar"
flat: true
loop:
- /etc/ceph
- /var/lib/ceph
delegate_to: "{{ target_node }}"
- name: Remove temp directory
ansible.builtin.file:
path: "{{ tmp_dir.path }}"
state: absent
delegate_to: "{{ target_node }}"
- name: Restore mode
when: mode == 'restore'
block:
- name: Unarchive files
ansible.builtin.unarchive:
src: "{{ backup_dir }}/{{ hostvars[target_node]['ansible_facts']['hostname'] }}/backup{{ item | replace('/', '-') }}.tar"
dest: "{{ item | dirname }}"
loop:
- /etc/ceph
- /var/lib/ceph
delegate_to: "{{ target_node }}"

View File

@ -0,0 +1,74 @@
---
# This playbook is used to manage CephX Keys
# You will find examples below on how the module can be used on daily operations
#
# It currently runs on localhost
- name: CephX key management examples
hosts: localhost
gather_facts: false
vars:
cluster: ceph
container_exec_cmd: "docker exec ceph-nano"
keys_to_info:
- client.admin
- mds.0
keys_to_delete:
- client.leseb
- client.leseb1
- client.pythonnnn
keys_to_create:
- { name: client.pythonnnn, caps: { mon: "allow rwx", mds: "allow *" }, mode: "0600" }
- { name: client.existpassss, caps: { mon: "allow r", osd: "allow *" }, mode: "0600" }
- { name: client.path, caps: { mon: "allow r", osd: "allow *" }, mode: "0600" }
tasks:
- name: Create ceph key(s) module
ceph_key:
name: "{{ item.name }}"
caps: "{{ item.caps }}"
cluster: "{{ cluster }}"
secret: "{{ item.key | default('') }}"
containerized: "{{ container_exec_cmd | default(False) }}"
with_items: "{{ keys_to_create }}"
- name: Update ceph key(s)
ceph_key:
name: "{{ item.name }}"
state: update
caps: "{{ item.caps }}"
cluster: "{{ cluster }}"
containerized: "{{ container_exec_cmd | default(False) }}"
with_items: "{{ keys_to_create }}"
- name: Delete ceph key(s)
ceph_key:
name: "{{ item }}"
state: absent
cluster: "{{ cluster }}"
containerized: "{{ container_exec_cmd | default(False) }}"
with_items: "{{ keys_to_delete }}"
- name: Info ceph key(s)
ceph_key_info:
name: "{{ item }}"
state: info
cluster: "{{ cluster }}"
containerized: "{{ container_exec_cmd }}"
register: key_info
ignore_errors: true
with_items: "{{ keys_to_info }}"
- name: List ceph key(s)
ceph_key_info:
state: list
cluster: "{{ cluster }}"
containerized: "{{ container_exec_cmd | default(False) }}"
register: list_keys
ignore_errors: true
- name: Fetch_initial_keys # noqa: ignore-errors
ceph_key:
state: fetch_initial_keys
cluster: "{{ cluster }}"
ignore_errors: true

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,383 @@
---
- name: Gather facts and prepare system for cephadm
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
- "{{ monitoring_group_name|default('monitoring') }}"
become: true
gather_facts: false
vars:
delegate_facts_host: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Fail if the monitor group doesn't exist or is empty
ansible.builtin.fail:
msg: "you must add a [mons] group and add at least one node."
run_once: true
when: groups[mon_group_name] is undefined or groups[mon_group_name] | length == 0
- name: Fail if the manager group doesn't exist or is empty
ansible.builtin.fail:
msg: "you must add a [mgrs] group and add at least one node."
run_once: true
when: groups[mgr_group_name] is undefined or groups[mgr_group_name] | length == 0
- name: Validate dashboard configuration
when: dashboard_enabled | bool
run_once: true
block:
- name: Fail if the [monitoring] group doesn't exist or is empty
ansible.builtin.fail:
msg: "you must add a [monitoring] group and add at least one node."
when: groups[monitoring_group_name] is undefined or groups[monitoring_group_name] | length == 0
- name: Fail when dashboard_admin_password is not set
ansible.builtin.fail:
msg: "you must set dashboard_admin_password."
when: dashboard_admin_password is undefined
- name: Validate container registry credentials
ansible.builtin.fail:
msg: 'ceph_docker_registry_username and/or ceph_docker_registry_password variables need to be set'
when:
- ceph_docker_registry_auth | bool
- (ceph_docker_registry_username is not defined or ceph_docker_registry_password is not defined) or
(ceph_docker_registry_username | length == 0 or ceph_docker_registry_password | length == 0)
- name: Gather facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
when: not delegate_facts_host | bool
- name: Gather and delegate facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
delegate_to: "{{ item }}"
delegate_facts: true
with_items: "{{ groups['all'] }}"
run_once: true
when: delegate_facts_host | bool
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary.yml
- name: Check if it is atomic host
ansible.builtin.stat:
path: /run/ostree-booted
register: stat_ostree
- name: Set_fact is_atomic
ansible.builtin.set_fact:
is_atomic: "{{ stat_ostree.stat.exists }}"
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
tasks_from: registry.yml
when: ceph_docker_registry_auth | bool
- name: Configure repository for installing cephadm
vars:
ceph_origin: repository
ceph_repository: community
block:
- name: Validate repository variables
ansible.builtin.import_role:
name: ceph-validate
tasks_from: check_repository.yml
- name: Configure repository
ansible.builtin.import_role:
name: ceph-common
tasks_from: "configure_repository.yml"
- name: Install cephadm requirements
ansible.builtin.package:
name: ['python3', 'lvm2']
register: result
until: result is succeeded
- name: Install cephadm
ansible.builtin.package:
name: cephadm
register: result
until: result is succeeded
- name: Set_fact cephadm_cmd
ansible.builtin.set_fact:
cephadm_cmd: "cephadm {{ '--docker' if container_binary == 'docker' else '' }}"
- name: Bootstrap the cluster
hosts: "{{ mon_group_name|default('mons') }}[0]"
become: true
gather_facts: false
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: set_monitor_address.yml
- name: Create /etc/ceph directory
ansible.builtin.file:
path: /etc/ceph
state: directory
mode: "0755"
- name: Bootstrap the new cluster
cephadm_bootstrap:
mon_ip: "{{ _monitor_addresses[inventory_hostname] }}"
image: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
docker: "{{ true if container_binary == 'docker' else false }}"
pull: false
dashboard: "{{ dashboard_enabled }}"
dashboard_user: "{{ dashboard_admin_user if dashboard_enabled | bool else omit }}"
dashboard_password: "{{ dashboard_admin_password if dashboard_enabled | bool else omit }}"
monitoring: false
firewalld: "{{ configure_firewall }}"
ssh_user: "{{ cephadm_ssh_user | default('root') }}"
ssh_config: "{{ cephadm_ssh_config | default(omit) }}"
- name: Set default container image in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set global container_image {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Set container image base in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set mgr mgr/cephadm/container_image_base {{ ceph_docker_registry }}/{{ ceph_docker_image }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Set dashboard container image in ceph mgr configuration
when: dashboard_enabled | bool
block:
- name: Set alertmanager container image in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set mgr mgr/cephadm/container_image_alertmanager {{ alertmanager_container_image }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Set grafana container image in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set mgr mgr/cephadm/container_image_grafana {{ grafana_container_image }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Set node-exporter container image in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set mgr mgr/cephadm/container_image_node_exporter {{ node_exporter_container_image }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Set prometheus container image in ceph configuration
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} config set mgr mgr/cephadm/container_image_prometheus {{ prometheus_container_image }}"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Add the other nodes
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
- "{{ monitoring_group_name|default('monitoring') }}"
become: true
gather_facts: false
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Get the cephadm ssh pub key
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} cephadm get-pub-key"
changed_when: false
run_once: true
register: cephadm_pubpkey
delegate_to: '{{ groups[mon_group_name][0] }}'
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Allow cephadm key
ansible.posix.authorized_key:
user: "{{ cephadm_ssh_user | default('root') }}"
key: '{{ cephadm_pubpkey.stdout }}'
- name: Run cephadm prepare-host
ansible.builtin.command: cephadm prepare-host
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Manage nodes with cephadm - ipv4
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch host add {{ ansible_facts['hostname'] }} {{ ansible_facts['all_ipv4_addresses'] | ips_in_ranges(public_network.split(',')) | first }} {{ group_names | join(' ') }} {{ '_admin' if mon_group_name | default('mons') in group_names else '' }}"
changed_when: false
delegate_to: '{{ groups[mon_group_name][0] }}'
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
when: ip_version == 'ipv4'
- name: Manage nodes with cephadm - ipv6
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch host add {{ ansible_facts['hostname'] }} {{ ansible_facts['all_ipv6_addresses'] | ips_in_ranges(public_network.split(',')) | last | ansible.utils.ipwrap }} {{ group_names | join(' ') }} {{ '_admin' if mon_group_name | default('mons') in group_names else '' }}"
changed_when: false
delegate_to: '{{ groups[mon_group_name][0] }}'
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
when: ip_version == 'ipv6'
- name: Add ceph label for core component
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch host label add {{ ansible_facts['hostname'] }} ceph"
changed_when: false
delegate_to: '{{ groups[mon_group_name][0] }}'
when: inventory_hostname in groups.get(mon_group_name, []) or
inventory_hostname in groups.get(osd_group_name, []) or
inventory_hostname in groups.get(mds_group_name, []) or
inventory_hostname in groups.get(rgw_group_name, []) or
inventory_hostname in groups.get(mgr_group_name, []) or
inventory_hostname in groups.get(rbdmirror_group_name, [])
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Adjust service placement
hosts: "{{ mon_group_name|default('mons') }}[0]"
become: true
gather_facts: false
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Update the placement of monitor hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply mon --placement='label:{{ mon_group_name }}'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Waiting for the monitor to join the quorum...
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} quorum_status --format json"
changed_when: false
register: ceph_health_raw
until: (ceph_health_raw.stdout | from_json)["quorum_names"] | length == groups.get(mon_group_name, []) | length
retries: "{{ health_mon_check_retries }}"
delay: "{{ health_mon_check_delay }}"
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of manager hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply mgr --placement='label:{{ mgr_group_name }}'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of crash hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply crash --placement='label:ceph'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of ceph-exporter hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply ceph-exporter --placement='label:ceph'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Adjust monitoring service placement
hosts: "{{ monitoring_group_name|default('monitoring') }}"
become: true
gather_facts: false
tasks:
- name: Import ceph-defaults
ansible.builtin.import_role:
name: ceph-defaults
- name: With dashboard enabled
when: dashboard_enabled | bool
delegate_to: '{{ groups[mon_group_name][0] }}'
run_once: true
block:
- name: Enable the prometheus module
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} mgr module enable prometheus"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of alertmanager hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply alertmanager --placement='label:{{ monitoring_group_name }}'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of grafana hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply grafana --placement='label:{{ monitoring_group_name }}'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of prometheus hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply prometheus --placement='label:{{ monitoring_group_name }}'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Update the placement of node-exporter hosts
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch apply node-exporter --placement='*'"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Print information
hosts: "{{ mon_group_name|default('mons') }}[0]"
become: true
gather_facts: false
tasks:
- name: Import ceph-defaults
ansible.builtin.import_role:
name: ceph-defaults
- name: Show ceph orchestrator services
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch ls --refresh"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Show ceph orchestrator daemons
ansible.builtin.command: "{{ cephadm_cmd }} shell -- ceph --cluster {{ cluster }} orch ps --refresh"
changed_when: false
environment:
CEPHADM_IMAGE: '{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}'
- name: Inform users about cephadm
ansible.builtin.debug:
msg: |
This Ceph cluster is now ready to receive more configuration, such as
adding OSD or MDS daemons and creating pools or keyrings.
You can do this by using the cephadm CLI; you don't need to use
ceph-ansible playbooks anymore.

View File

@ -0,0 +1,236 @@
---
# This playbook is intended to be used as part of the el7 to el8 OS upgrade.
# It modifies the systemd unit files so containers are launched with podman
# instead of docker after the OS reboot once it is upgraded.
# It is *not* intended to restart services, since we don't want multiple service
# restarts.
- name: Pre-requisite and facts gathering
hosts:
- mons
- osds
- mdss
- rgws
- nfss
- rbdmirrors
- clients
- mgrs
- monitoring
gather_facts: false
become: true
any_errors_fatal: true
vars:
delegate_facts_host: true
pre_tasks:
- name: Import raw_install_python tasks
ansible.builtin.import_tasks: "{{ playbook_dir }}/../raw_install_python.yml"
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
# pre-tasks for following import -
- name: Gather facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
when: not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])
- name: Gather and delegate facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
delegate_to: "{{ item }}"
delegate_facts: true
with_items: "{{ groups['all'] | difference(groups.get(client_group_name | default('clients'), [])) }}"
run_once: true
when: delegate_facts_host | bool
- name: Migrate to podman
hosts:
- "{{ mon_group_name | default('mons') }}"
- "{{ osd_group_name | default('osds') }}"
- "{{ mds_group_name | default('mdss') }}"
- "{{ rgw_group_name | default('rgws') }}"
- "{{ nfs_group_name | default('nfss') }}"
- "{{ mgr_group_name | default('mgrs') }}"
- "{{ rbdmirror_group_name | default('rbdmirrors') }}"
- "{{ monitoring_group_name | default('monitoring') }}"
gather_facts: false
become: true
tasks:
- name: Set_fact docker2podman and container_binary
ansible.builtin.set_fact:
docker2podman: true
container_binary: podman
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Install podman
ansible.builtin.package:
name: podman
state: present
register: result
until: result is succeeded
tags: with_pkg
when: not is_atomic | bool
- name: Check podman presence # noqa command-instead-of-shell
ansible.builtin.shell: command -v podman
register: podman_presence
changed_when: false
failed_when: false
- name: Pulling images from docker daemon
when: podman_presence.rc == 0
block:
- name: Pulling Ceph container image from docker daemon
ansible.builtin.command: "{{ timeout_command }} {{ container_binary }} pull docker-daemon:{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
changed_when: false
register: pull_image
until: pull_image.rc == 0
retries: "{{ docker_pull_retry }}"
delay: 10
when: inventory_hostname in groups.get(mon_group_name, []) or
inventory_hostname in groups.get(osd_group_name, []) or
inventory_hostname in groups.get(mds_group_name, []) or
inventory_hostname in groups.get(rgw_group_name, []) or
inventory_hostname in groups.get(mgr_group_name, []) or
inventory_hostname in groups.get(rbdmirror_group_name, []) or
inventory_hostname in groups.get(nfs_group_name, [])
- name: Pulling alertmanager/grafana/prometheus images from docker daemon
ansible.builtin.command: "{{ timeout_command }} {{ container_binary }} pull docker-daemon:{{ item }}"
changed_when: false
register: pull_image
until: pull_image.rc == 0
retries: "{{ docker_pull_retry }}"
delay: 10
loop:
- "{{ alertmanager_container_image }}"
- "{{ grafana_container_image }}"
- "{{ prometheus_container_image }}"
when:
- dashboard_enabled | bool
- inventory_hostname in groups.get(monitoring_group_name, [])
- name: Pulling node_exporter image from docker daemon
ansible.builtin.command: "{{ timeout_command }} {{ container_binary }} pull docker-daemon:{{ node_exporter_container_image }}"
changed_when: false
register: pull_image
until: pull_image.rc == 0
retries: "{{ docker_pull_retry }}"
delay: 10
when: dashboard_enabled | bool
- name: Import ceph-mon role
ansible.builtin.import_role:
name: ceph-mon
tasks_from: systemd.yml
when: inventory_hostname in groups.get(mon_group_name, [])
- name: Import ceph-mds role
ansible.builtin.import_role:
name: ceph-mds
tasks_from: systemd.yml
when: inventory_hostname in groups.get(mds_group_name, [])
- name: Import ceph-mgr role
ansible.builtin.import_role:
name: ceph-mgr
tasks_from: systemd.yml
when: inventory_hostname in groups.get(mgr_group_name, [])
- name: Import ceph-nfs role
ansible.builtin.import_role:
name: ceph-nfs
tasks_from: systemd.yml
when: inventory_hostname in groups.get(nfs_group_name, [])
- name: Import ceph-osd role
ansible.builtin.import_role:
name: ceph-osd
tasks_from: systemd.yml
when: inventory_hostname in groups.get(osd_group_name, [])
- name: Import ceph-rbd-mirror role
ansible.builtin.import_role:
name: ceph-rbd-mirror
tasks_from: systemd.yml
when: inventory_hostname in groups.get(rbdmirror_group_name, [])
- name: Import ceph-rgw role
ansible.builtin.import_role:
name: ceph-rgw
tasks_from: systemd.yml
when: inventory_hostname in groups.get(rgw_group_name, [])
- name: Import ceph-crash role
ansible.builtin.import_role:
name: ceph-crash
tasks_from: systemd.yml
when: inventory_hostname in groups.get(mon_group_name, []) or
inventory_hostname in groups.get(osd_group_name, []) or
inventory_hostname in groups.get(mds_group_name, []) or
inventory_hostname in groups.get(rgw_group_name, []) or
inventory_hostname in groups.get(mgr_group_name, []) or
inventory_hostname in groups.get(rbdmirror_group_name, [])
- name: Import ceph-exporter role
ansible.builtin.import_role:
name: ceph-exporter
tasks_from: systemd.yml
when: inventory_hostname in groups.get(mon_group_name, []) or
inventory_hostname in groups.get(osd_group_name, []) or
inventory_hostname in groups.get(mds_group_name, []) or
inventory_hostname in groups.get(rgw_group_name, []) or
inventory_hostname in groups.get(mgr_group_name, []) or
inventory_hostname in groups.get(rbdmirror_group_name, [])
- name: Dashboard configuration
when: dashboard_enabled | bool
block:
- name: Import ceph-node-exporter role
ansible.builtin.import_role:
name: ceph-node-exporter
tasks_from: systemd.yml
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: grafana.yml
when: inventory_hostname in groups.get(monitoring_group_name, [])
- name: Import ceph-grafana role
ansible.builtin.import_role:
name: ceph-grafana
tasks_from: systemd.yml
when: inventory_hostname in groups.get(monitoring_group_name, [])
- name: Import ceph-prometheus role
ansible.builtin.import_role:
name: ceph-prometheus
tasks_from: systemd.yml
when: inventory_hostname in groups.get(monitoring_group_name, [])
- name: Reload systemd daemon
ansible.builtin.systemd:
daemon_reload: true

View File

@ -0,0 +1,39 @@
---
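# This playbook fetches /etc/ceph and /var/log/ceph from every Ceph node
# into a temporary directory created on the machine running Ansible.
#
# Illustrative invocation (the playbook filename is an assumption here,
# mirroring the naming of the other infrastructure playbooks):
#   ansible-playbook gather-ceph-logs.yml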
- name: Gather ceph logs
hosts:
- mons
- osds
- mdss
- rgws
- nfss
- rbdmirrors
- clients
- mgrs
gather_facts: false
become: true
tasks:
- name: Create a temp directory
ansible.builtin.tempfile:
state: directory
prefix: ceph_ansible
run_once: true
register: localtempfile
become: false
delegate_to: localhost
- name: Find ceph keys, conf and logs
ansible.builtin.find:
paths:
- /etc/ceph
- /var/log/ceph
register: ceph_collect
- name: Collect ceph logs, config and keys on the machine running ansible
ansible.builtin.fetch:
src: "{{ item.path }}"
dest: "{{ localtempfile.path }}"
fail_on_missing: false
flat: false
with_items: "{{ ceph_collect.files }}"

View File

@ -0,0 +1,100 @@
---
- name: Creates logical volumes for the bucket index or fs journals on a single device.
become: true
hosts: osds
vars:
logfile: |
Suggested cut and paste under "lvm_volumes:" in "group_vars/osds.yml"
-----------------------------------------------------------------------------------------------------------
{% for lv in nvme_device_lvs %}
- data: {{ lv.lv_name }}
data_vg: {{ nvme_vg_name }}
journal: {{ lv.journal_name }}
journal_vg: {{ nvme_vg_name }}
{% endfor %}
{% for hdd in hdd_devices %}
- data: {{ hdd_lv_prefix }}-{{ hdd.split('/')[-1] }}
data_vg: {{ hdd_vg_prefix }}-{{ hdd.split('/')[-1] }}
journal: {{ hdd_journal_prefix }}-{{ hdd.split('/')[-1] }}
journal_vg: {{ nvme_vg_name }}
{% endfor %}
tasks:
- name: Include vars of lv_vars.yaml
ansible.builtin.include_vars:
file: lv_vars.yaml # noqa missing-import
failed_when: false
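# A minimal sketch of what lv_vars.yaml is expected to provide; every
# device path, name and size below is an illustrative assumption, not a
# default shipped with this playbook.
#
# nvme_device: /dev/nvme0n1
# nvme_vg_name: ceph-nvme-vg-nvme0n1
# journal_size: 5G
# nvme_device_lvs:
#   - lv_name: ceph-bucket-index-1
#     size: 50G
#     journal_name: ceph-journal-bucket-index-1-nvme0n1
# hdd_devices:
#   - /dev/sda
#   - /dev/sdb
# hdd_vg_prefix: ceph-hdd-vg
# hdd_lv_prefix: ceph-hdd-lv
# hdd_lv_size: 100%FREE
# hdd_journal_prefix: ceph-journal
# logfile_path: ./logfile.yml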
# ensure nvme_device is set
- name: Fail if nvme_device is not defined
ansible.builtin.fail:
msg: "nvme_device has not been set by the user"
when: nvme_device is undefined or nvme_device == 'dummy'
# need to check if lvm2 is installed
- name: Install lvm2
ansible.builtin.package:
name: lvm2
state: present
register: result
until: result is succeeded
# Make entire nvme device a VG
- name: Add nvme device as lvm pv
community.general.lvg:
force: true
pvs: "{{ nvme_device }}"
pesize: 4
state: present
vg: "{{ nvme_vg_name }}"
- name: Create lvs for fs journals for the bucket index on the nvme device
community.general.lvol:
lv: "{{ item.journal_name }}"
vg: "{{ nvme_vg_name }}"
size: "{{ journal_size }}"
pvs: "{{ nvme_device }}"
with_items: "{{ nvme_device_lvs }}"
- name: Create lvs for fs journals for hdd devices
community.general.lvol:
lv: "{{ hdd_journal_prefix }}-{{ item.split('/')[-1] }}"
vg: "{{ nvme_vg_name }}"
size: "{{ journal_size }}"
with_items: "{{ hdd_devices }}"
- name: Create the lv for data portion of the bucket index on the nvme device
community.general.lvol:
lv: "{{ item.lv_name }}"
vg: "{{ nvme_vg_name }}"
size: "{{ item.size }}"
pvs: "{{ nvme_device }}"
with_items: "{{ nvme_device_lvs }}"
# Make sure all hdd devices have a unique volume group
- name: Create vgs for all hdd devices
community.general.lvg:
force: true
pvs: "{{ item }}"
pesize: 4
state: present
vg: "{{ hdd_vg_prefix }}-{{ item.split('/')[-1] }}"
with_items: "{{ hdd_devices }}"
- name: Create lvs for the data portion on hdd devices
community.general.lvol:
lv: "{{ hdd_lv_prefix }}-{{ item.split('/')[-1] }}"
vg: "{{ hdd_vg_prefix }}-{{ item.split('/')[-1] }}"
size: "{{ hdd_lv_size }}"
pvs: "{{ item }}"
with_items: "{{ hdd_devices }}"
- name: Write output for osds.yml
become: false
ansible.builtin.copy:
content: "{{ logfile }}"
dest: "{{ logfile_path }}"
mode: preserve
delegate_to: localhost

View File

@ -0,0 +1,109 @@
---
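# This playbook undoes what the logical volume creation playbook above set
# up on the OSD hosts: it unmounts any OSD filesystems, then removes the
# logical volumes, volume groups and physical volumes.
#
# Example invocation (the lv-teardown name and the ireallymeanit override
# are taken from the fail message below):
#   ansible-playbook lv-teardown.yml -e ireallymeanit=yes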
- name: Tear down existing osd filesystems then logical volumes, volume groups, and physical volumes
become: true
hosts: osds
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to tear down the logical volumes?
default: 'no'
private: false
tasks:
- name: Exit playbook, if user did not mean to tear down logical volumes
ansible.builtin.fail:
msg: >
"Exiting lv-teardown playbook, logical volumes were NOT torn down.
To tear down the logical volumes, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Include vars of lv_vars.yaml
ansible.builtin.include_vars:
file: lv_vars.yaml # noqa missing-import
failed_when: false
# need to check if lvm2 is installed
- name: Install lvm2
ansible.builtin.package:
name: lvm2
state: present
register: result
until: result is succeeded
# BEGIN TEARDOWN
- name: Find any existing osd filesystems
ansible.builtin.shell: |
set -o pipefail;
grep /var/lib/ceph/osd /proc/mounts | awk '{print $2}'
register: old_osd_filesystems
changed_when: false
- name: Tear down any existing osd filesystem
ansible.posix.mount:
path: "{{ item }}"
state: unmounted
with_items: "{{ old_osd_filesystems.stdout_lines }}"
- name: Kill all lvm commands that may have hung
ansible.builtin.command: "killall -q lvcreate pvcreate vgcreate lvconvert || echo -n"
failed_when: false
changed_when: false
## Logical Vols
- name: Tear down existing lv for bucket index
community.general.lvol:
lv: "{{ item.lv_name }}"
vg: "{{ nvme_vg_name }}"
state: absent
force: true
with_items: "{{ nvme_device_lvs }}"
- name: Tear down any existing hdd data lvs
community.general.lvol:
lv: "{{ hdd_lv_prefix }}-{{ item.split('/')[-1] }}"
vg: "{{ hdd_vg_prefix }}-{{ item.split('/')[-1] }}"
state: absent
force: true
with_items: "{{ hdd_devices }}"
- name: Tear down any existing lv of journal for bucket index
community.general.lvol:
lv: "{{ item.journal_name }}"
vg: "{{ nvme_vg_name }}"
state: absent
force: true
with_items: "{{ nvme_device_lvs }}"
- name: Tear down any existing lvs of hdd journals
community.general.lvol:
lv: "{{ hdd_journal_prefix }}-{{ item.split('/')[-1] }}"
vg: "{{ nvme_vg_name }}"
state: absent
force: true
with_items: "{{ hdd_devices }}"
## Volume Groups
- name: Remove vg on nvme device
community.general.lvg:
vg: "{{ nvme_vg_name }}"
state: absent
force: true
- name: Remove vg for each hdd device
community.general.lvg:
vg: "{{ hdd_vg_prefix }}-{{ item.split('/')[-1] }}"
state: absent
force: true
with_items: "{{ hdd_devices }}"
## Physical Vols
- name: Tear down pv for nvme device
ansible.builtin.command: "pvremove --force --yes {{ nvme_device }}"
changed_when: false
- name: Tear down pv for each hdd device
ansible.builtin.command: "pvremove --force --yes {{ item }}"
changed_when: false
with_items: "{{ hdd_devices }}"

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1 @@
purge-cluster.yml

View File

@ -0,0 +1,222 @@
---
# This playbook purges the Ceph MGR Dashboard and Monitoring
# (alertmanager/prometheus/grafana/node-exporter) stack.
# It removes: packages, configuration files and ALL THE DATA
#
# Use it like this:
# ansible-playbook purge-dashboard.yml
# Prompts for confirmation to purge; it defaults to 'no' and
# purges nothing. Answering 'yes' purges the dashboard and
# monitoring stack.
#
# ansible-playbook -e ireallymeanit=yes|no purge-dashboard.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Confirm whether user really meant to purge the dashboard
hosts: localhost
gather_facts: false
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to purge the dashboard?
default: 'no'
private: false
tasks:
- name: Exit playbook, if user did not mean to purge dashboard
ansible.builtin.fail:
msg: >
"Exiting purge-dashboard playbook, dashboard was NOT purged.
To purge the dashboard, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Gather facts on all hosts
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
- "{{ client_group_name|default('clients') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ monitoring_group_name | default('monitoring') }}"
become: true
tasks:
- name: Gather facts on all Ceph hosts for following reference
ansible.builtin.debug:
msg: "gather facts on all Ceph hosts for following reference"
- name: Purge node exporter
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
- "{{ client_group_name|default('clients') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ monitoring_group_name | default('monitoring') }}"
gather_facts: false
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Disable node_exporter service
ansible.builtin.service:
name: node_exporter
state: stopped
enabled: false
failed_when: false
- name: Remove node_exporter service files
ansible.builtin.file:
name: "{{ item }}"
state: absent
loop:
- /etc/systemd/system/node_exporter.service
- /run/node_exporter.service-cid
- name: Remove node-exporter image
ansible.builtin.command: "{{ container_binary }} rmi {{ node_exporter_container_image }}"
changed_when: false
failed_when: false
- name: Purge ceph monitoring
hosts: "{{ monitoring_group_name | default('monitoring') }}"
gather_facts: false
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Stop services
ansible.builtin.service:
name: "{{ item }}"
state: stopped
enabled: false
failed_when: false
loop:
- alertmanager
- prometheus
- grafana-server
- name: Remove systemd service files
ansible.builtin.file:
name: "{{ item }}"
state: absent
loop:
- /etc/systemd/system/alertmanager.service
- /etc/systemd/system/prometheus.service
- /etc/systemd/system/grafana-server.service
- /run/alertmanager.service-cid
- /run/prometheus.service-cid
- /run/grafana-server.service-cid
- name: Remove ceph dashboard container images
ansible.builtin.command: "{{ container_binary }} rmi {{ item }}"
loop:
- "{{ alertmanager_container_image }}"
- "{{ prometheus_container_image }}"
- "{{ grafana_container_image }}"
changed_when: false
failed_when: false
- name: Remove ceph-grafana-dashboards package on RedHat or SUSE
ansible.builtin.package:
name: ceph-grafana-dashboards
state: absent
when:
- not containerized_deployment | bool
- ansible_facts['os_family'] in ['RedHat', 'Suse']
- name: Remove data
ansible.builtin.file:
name: "{{ item }}"
state: absent
loop:
- "{{ alertmanager_conf_dir }}"
- "{{ prometheus_conf_dir }}"
- /etc/grafana
- "{{ alertmanager_data_dir }}"
- "{{ prometheus_data_dir }}"
- /var/lib/grafana
- name: Purge ceph dashboard
hosts: "{{ groups[mgr_group_name] | default(groups[mon_group_name]) | default(omit) }}"
gather_facts: false
become: true
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Remove the dashboard admin user
ceph_dashboard_user:
name: "{{ dashboard_admin_user }}"
cluster: "{{ cluster }}"
state: absent
run_once: true
delegate_to: "{{ groups[mon_group_name][0] }}"
- name: Remove radosgw system user
radosgw_user:
name: "{{ dashboard_rgw_api_user_id }}"
cluster: "{{ cluster }}"
state: absent
run_once: true
delegate_to: "{{ groups[mon_group_name][0] }}"
when: groups.get(rgw_group_name, []) | length > 0
- name: Disable mgr dashboard and prometheus modules
ceph_mgr_module:
name: "{{ item }}"
cluster: "{{ cluster }}"
state: disable
run_once: true
delegate_to: "{{ groups[mon_group_name][0] }}"
loop:
- dashboard
- prometheus
- name: Remove TLS certificate and key files
ansible.builtin.file:
name: "/etc/ceph/ceph-dashboard.{{ item }}"
state: absent
loop:
- crt
- key
when: dashboard_protocol == "https"
- name: Remove ceph-mgr-dashboard package
ansible.builtin.package:
name: ceph-mgr-dashboard
state: absent
when: not containerized_deployment | bool

View File

@ -0,0 +1,65 @@
# This example playbook is used to add rgw users and buckets
#
# This example is run on your local machine
#
# Ensure that your local machine can connect to rgw of your cluster
#
# You will need to update the following vars
#
# rgw_host
# port
# admin_access_key
# admin_secret_key
#
# Additionally modify the users list and buckets list to create the
# users and buckets you want
#
- name: Add rgw users and buckets
connection: local
hosts: localhost
gather_facts: false
tasks:
- name: Add rgw users and buckets
ceph_add_users_buckets:
rgw_host: '172.20.0.2'
port: 8000
admin_access_key: '8W56BITCSX27CD555Z5B'
admin_secret_key: 'JcrsUNDNPAvnAWHiBmwKOzMNreOIw2kJWAclQQ20'
users:
- username: 'test1'
fullname: 'tester'
email: 'dan1@email.com'
maxbucket: 666
suspend: false
autogenkey: false
accesskey: 'B3AR4Q33L59YV56A9A2F'
secretkey: 'd84BRnMysnVGSyZiRlYUMduVgIarQWiNMdKzrF76'
userquota: true
usermaxsize: '1000'
usermaxobjects: 3
bucketquota: true
bucketmaxsize: '1000'
bucketmaxobjects: 3
- username: 'test2'
fullname: 'tester'
buckets:
- bucket: 'bucket1'
user: 'test2'
- bucket: 'bucket2'
user: 'test1'
- bucket: 'bucket3'
user: 'test1'
- bucket: 'bucket4'
user: 'test1'
- bucket: 'bucket5'
user: 'test1'
- bucket: 'bucket6'
user: 'test2'
- bucket: 'bucket7'
user: 'test2'
- bucket: 'bucket8'
user: 'test2'
- bucket: 'bucket9'
user: 'test2'
- bucket: 'bucket10'
user: 'test2'

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,177 @@
---
# This playbook removes the Ceph MDS from your cluster.
#
# Use it like this:
# ansible-playbook shrink-mds.yml -e mds_to_kill=ceph-mds01
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-mds.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Gather facts and check the init system
hosts:
- "{{ mon_group_name | default('mons') }}"
- "{{ mds_group_name | default('mdss') }}"
become: true
tasks:
- name: Gather facts on all Ceph hosts for following reference
ansible.builtin.debug:
msg: gather facts on all Ceph hosts for following reference
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Perform checks, remove mds and print cluster health
hosts: mons[0]
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Exit playbook, if no mds was given
when: mds_to_kill is not defined
ansible.builtin.fail:
msg: >
mds_to_kill must be declared.
Exiting shrink-cluster playbook, no MDS was removed. On the command
line when invoking the playbook, you can use
"-e mds_to_kill=ceph-mds1" argument. You can only remove a single
MDS each time the playbook runs.
- name: Exit playbook, if the mds is not part of the inventory
when: mds_to_kill not in groups[mds_group_name]
ansible.builtin.fail:
msg: "It seems that the host given is not part of your inventory,
please make sure it is."
- name: Exit playbook, if user did not mean to shrink cluster
when: ireallymeanit != 'yes'
ansible.builtin.fail:
msg: "Exiting shrink-mds playbook, no mds was removed.
To shrink the cluster, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
- name: Set_fact container_exec_cmd for mon0
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_facts['hostname'] }}"
when: containerized_deployment | bool
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} health"
changed_when: false
register: ceph_health
until: ceph_health is succeeded
retries: 5
delay: 2
- name: Set_fact mds_to_kill_hostname
ansible.builtin.set_fact:
mds_to_kill_hostname: "{{ hostvars[mds_to_kill]['ansible_facts']['hostname'] }}"
tasks:
# get rid of this as soon as "systemctl stop ceph-mds@$HOSTNAME" also
# removes the MDS from the FS map.
- name: Exit mds when containerized deployment
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph tell mds.{{ mds_to_kill_hostname }} exit"
changed_when: false
when: containerized_deployment | bool
- name: Get ceph status
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s -f json"
register: ceph_status
changed_when: false
- name: Set_fact current_max_mds
ansible.builtin.set_fact:
current_max_mds: "{{ (ceph_status.stdout | from_json)['fsmap']['max'] }}"
- name: Fail if removing that mds node wouldn't satisfy max_mds anymore
ansible.builtin.fail:
msg: "Can't remove more mds as it won't satisfy current max_mds setting"
when:
- ((((ceph_status.stdout | from_json)['fsmap']['up'] | int) + ((ceph_status.stdout | from_json)['fsmap']['up:standby'] | int)) - 1) < current_max_mds | int
- (ceph_status.stdout | from_json)['fsmap']['up'] | int > 1
- name: Stop mds service and verify it
block:
- name: Stop mds service
ansible.builtin.service:
name: ceph-mds@{{ mds_to_kill_hostname }}
state: stopped
enabled: false
delegate_to: "{{ mds_to_kill }}"
failed_when: false
- name: Ensure that the mds is stopped
ansible.builtin.command: "systemctl is-active ceph-mds@{{ mds_to_kill_hostname }}" # noqa command-instead-of-module
register: mds_to_kill_status
failed_when: mds_to_kill_status.rc == 0
delegate_to: "{{ mds_to_kill }}"
retries: 5
delay: 2
changed_when: false
- name: Fail if the mds is reported as active or standby
block:
- name: Get new ceph status
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s -f json"
register: ceph_status
changed_when: false
- name: Get active mds nodes list
ansible.builtin.set_fact:
active_mdss: "{{ active_mdss | default([]) + [item.name] }}"
with_items: "{{ (ceph_status.stdout | from_json)['fsmap']['by_rank'] }}"
- name: Get ceph fs dump status
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} fs dump -f json"
register: ceph_fs_status
changed_when: false
- name: Create a list of standby mdss
ansible.builtin.set_fact:
standby_mdss: "{{ (ceph_fs_status.stdout | from_json)['standbys'] | map(attribute='name') | list }}"
- name: Fail if mds just killed is being reported as active or standby
ansible.builtin.fail:
msg: "mds node {{ mds_to_kill }} still up and running."
when:
- (mds_to_kill in active_mdss | default([])) or
(mds_to_kill in standby_mdss | default([]))
- name: Delete the filesystem when killing last mds
ceph_fs:
name: "{{ cephfs }}"
cluster: "{{ cluster }}"
state: absent
when:
- (ceph_status.stdout | from_json)['fsmap']['up'] | int == 0
- (ceph_status.stdout | from_json)['fsmap']['up:standby'] | int == 0
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
- name: Purge mds store
ansible.builtin.file:
path: /var/lib/ceph/mds/{{ cluster }}-{{ mds_to_kill_hostname }}
state: absent
delegate_to: "{{ mds_to_kill }}"
post_tasks:
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
changed_when: false

View File

@ -0,0 +1,138 @@
---
# This playbook shrinks the Ceph manager from your cluster
#
# Use it like this:
# ansible-playbook shrink-mgr.yml -e mgr_to_kill=ceph-mgr1
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-mgr.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Gather facts and check the init system
hosts:
- "{{ mon_group_name | default('mons') }}"
- "{{ mgr_group_name | default('mgrs') }}"
become: true
tasks:
- name: Gather facts on all Ceph hosts for following reference
ansible.builtin.debug:
msg: gather facts on all Ceph hosts for following reference
- name: Confirm if user really meant to remove manager from the ceph cluster
hosts: mons[0]
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Set_fact container_exec_cmd
when: containerized_deployment | bool
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_facts['hostname'] }}"
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} health"
register: ceph_health
changed_when: false
until: ceph_health is succeeded
retries: 5
delay: 2
- name: Get total number of mgrs in cluster
block:
- name: Save mgr dump output
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} mgr dump -f json"
register: mgr_dump
changed_when: false
- name: Get active and standbys mgr list
ansible.builtin.set_fact:
active_mgr: "{{ [mgr_dump.stdout | from_json] | map(attribute='active_name') | list }}"
standbys_mgr: "{{ (mgr_dump.stdout | from_json)['standbys'] | map(attribute='name') | list }}"
- name: Exit playbook, if there's no standby manager
ansible.builtin.fail:
msg: "You are about to shrink the only manager present in the cluster."
when: standbys_mgr | length | int < 1
- name: Exit playbook, if no manager was given
ansible.builtin.fail:
msg: "mgr_to_kill must be declared
Exiting shrink-cluster playbook, no manager was removed.
On the command line when invoking the playbook, you can use
-e mgr_to_kill=ceph-mgr01 argument. You can only remove a single
manager each time the playbook runs."
when: mgr_to_kill is not defined
- name: Exit playbook, if user did not mean to shrink cluster
ansible.builtin.fail:
msg: "Exiting shrink-mgr playbook, no manager was removed.
To shrink the cluster, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Set_fact mgr_to_kill_hostname
ansible.builtin.set_fact:
mgr_to_kill_hostname: "{{ hostvars[mgr_to_kill]['ansible_facts']['hostname'] }}"
- name: Exit playbook, if the selected manager is not present in the cluster
ansible.builtin.fail:
msg: "It seems that the host given is not present in the cluster."
when:
- mgr_to_kill_hostname not in active_mgr
- mgr_to_kill_hostname not in standbys_mgr
tasks:
- name: Stop manager services and verify it
block:
- name: Stop manager service
ansible.builtin.service:
name: ceph-mgr@{{ mgr_to_kill_hostname }}
state: stopped
enabled: false
delegate_to: "{{ mgr_to_kill }}"
failed_when: false
- name: Ensure that the mgr is stopped
ansible.builtin.command: "systemctl is-active ceph-mgr@{{ mgr_to_kill_hostname }}" # noqa command-instead-of-module
register: mgr_to_kill_status
failed_when: mgr_to_kill_status.rc == 0
delegate_to: "{{ mgr_to_kill }}"
changed_when: false
retries: 5
delay: 2
- name: Fail if the mgr is reported in ceph mgr dump
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} mgr dump -f json"
register: mgr_dump
changed_when: false
failed_when: mgr_to_kill_hostname in (([mgr_dump.stdout | from_json] | map(attribute='active_name') | list) + (mgr_dump.stdout | from_json)['standbys'] | map(attribute='name') | list)
until: mgr_to_kill_hostname not in (([mgr_dump.stdout | from_json] | map(attribute='active_name') | list) + (mgr_dump.stdout | from_json)['standbys'] | map(attribute='name') | list)
retries: 12
delay: 10
- name: Purge manager store
ansible.builtin.file:
path: /var/lib/ceph/mgr/{{ cluster }}-{{ mgr_to_kill_hostname }}
state: absent
delegate_to: "{{ mgr_to_kill }}"
post_tasks:
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
changed_when: false

View File

@ -0,0 +1,151 @@
---
# This playbook shrinks the Ceph monitors from your cluster
# It can remove a Ceph monitor from the cluster and ALL ITS DATA
#
# Use it like this:
# ansible-playbook shrink-mon.yml -e mon_to_kill=ceph-mon01
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-mon.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Gather facts and check the init system
hosts: "{{ mon_group_name|default('mons') }}"
become: true
tasks:
- name: Gather facts on all Ceph hosts for following reference
ansible.builtin.debug:
msg: "gather facts on all Ceph hosts for following reference"
- name: Confirm whether user really meant to remove monitor from the ceph cluster
hosts: mons[0]
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
vars:
mon_group_name: mons
pre_tasks:
- name: Exit playbook, if only one monitor is present in cluster
ansible.builtin.fail:
msg: "You are about to shrink the only monitor present in the cluster.
If you really want to do that, please use the purge-cluster playbook."
when: groups[mon_group_name] | length | int == 1
- name: Exit playbook, if no monitor was given
ansible.builtin.fail:
msg: "mon_to_kill must be declared
Exiting shrink-cluster playbook, no monitor was removed.
On the command line when invoking the playbook, you can use
-e mon_to_kill=ceph-mon01 argument. You can only remove a single monitor each time the playbook runs."
when: mon_to_kill is not defined
- name: Exit playbook, if the monitor is not part of the inventory
ansible.builtin.fail:
msg: "It seems that the host given is not part of your inventory, please make sure it is."
when: mon_to_kill not in groups[mon_group_name]
- name: Exit playbook, if user did not mean to shrink cluster
ansible.builtin.fail:
msg: "Exiting shrink-mon playbook, no monitor was removed.
To shrink the cluster, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
tasks:
- name: Pick a monitor different than the one we want to remove
ansible.builtin.set_fact:
mon_host: "{{ item }}"
with_items: "{{ groups[mon_group_name] }}"
when: item != mon_to_kill
- name: Set container_exec_cmd fact
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ hostvars[mon_host]['ansible_facts']['hostname'] }}"
when: containerized_deployment | bool
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd }} timeout 5 ceph --cluster {{ cluster }} health"
register: ceph_health
changed_when: false
until: ceph_health.stdout.find("HEALTH") > -1
delegate_to: "{{ mon_host }}"
retries: 5
delay: 2
- name: Set_fact mon_to_kill_hostname
ansible.builtin.set_fact:
mon_to_kill_hostname: "{{ hostvars[mon_to_kill]['ansible_facts']['hostname'] }}"
- name: Stop monitor service(s)
ansible.builtin.service:
name: ceph-mon@{{ mon_to_kill_hostname }}
state: stopped
enabled: false
delegate_to: "{{ mon_to_kill }}"
failed_when: false
- name: Purge monitor store
ansible.builtin.file:
path: /var/lib/ceph/mon/{{ cluster }}-{{ mon_to_kill_hostname }}
state: absent
delegate_to: "{{ mon_to_kill }}"
- name: Remove monitor from the quorum
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} mon remove {{ mon_to_kill_hostname }}"
changed_when: false
failed_when: false
delegate_to: "{{ mon_host }}"
post_tasks:
- name: Verify the monitor is out of the cluster
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} quorum_status -f json"
delegate_to: "{{ mon_host }}"
changed_when: false
failed_when: false
register: result
until: mon_to_kill_hostname not in (result.stdout | from_json)['quorum_names']
retries: 2
delay: 10
- name: Please remove the monitor from your ceph configuration file
ansible.builtin.debug:
msg: "The monitor has been successfully removed from the cluster.
Please remove the monitor entry from the rest of your ceph configuration files, cluster wide."
run_once: true
when: mon_to_kill_hostname not in (result.stdout | from_json)['quorum_names']
- name: Fail if monitor is still part of the cluster
ansible.builtin.fail:
msg: "Monitor appears to still be part of the cluster, please check what happened."
run_once: true
when: mon_to_kill_hostname in (result.stdout | from_json)['quorum_names']
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} -s"
delegate_to: "{{ mon_host }}"
changed_when: false
- name: Show ceph mon status
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} mon stat"
delegate_to: "{{ mon_host }}"
changed_when: false

View File

@ -0,0 +1,379 @@
---
# This playbook shrinks Ceph OSDs that have been created with ceph-volume.
# It can remove any number of OSD(s) from the cluster and ALL THEIR DATA
#
# Use it like this:
# ansible-playbook shrink-osd.yml -e osd_to_kill=0,2,6
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-osd.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Gather facts and check the init system
hosts:
- mons
- osds
become: true
tasks:
- name: Gather facts on all Ceph hosts for following reference
ansible.builtin.debug:
msg: "gather facts on all Ceph hosts for following reference"
- name: Confirm whether user really meant to remove osd(s) from the cluster
hosts: mons[0]
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
vars:
mon_group_name: mons
osd_group_name: osds
pre_tasks:
- name: Exit playbook, if user did not mean to shrink cluster
ansible.builtin.fail:
msg: "Exiting shrink-osd playbook, no osd(s) was/were removed..
To shrink the cluster, either say 'yes' on the prompt or
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Exit playbook, if no osd(s) was/were given
ansible.builtin.fail:
msg: "osd_to_kill must be declared
Exiting shrink-osd playbook, no OSD(s) was/were removed.
On the command line when invoking the playbook, you can use
-e osd_to_kill=0,1,2,3 argument."
when: osd_to_kill is not defined
- name: Check the osd ids passed have the correct format
ansible.builtin.fail:
msg: "The id {{ item }} has wrong format, please pass the number only"
with_items: "{{ osd_to_kill.split(',') }}"
when: not item is regex("^\d+$")
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
post_tasks:
- name: Set_fact container_exec_cmd build docker exec command (containerized)
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_facts['hostname'] }}"
when: containerized_deployment | bool
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd }} timeout 5 ceph --cluster {{ cluster }} health"
register: ceph_health
changed_when: false
until: ceph_health.stdout.find("HEALTH") > -1
retries: 5
delay: 2
- name: Find the host(s) where the osd(s) is/are running on
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} osd find {{ item }}"
changed_when: false
with_items: "{{ osd_to_kill.split(',') }}"
register: find_osd_hosts
- name: Set_fact osd_hosts
ansible.builtin.set_fact:
osd_hosts: "{{ osd_hosts | default([]) + [[(item.stdout | from_json).crush_location.host, (item.stdout | from_json).osd_fsid, item.item]] }}"
with_items: "{{ find_osd_hosts.results }}"
- name: Set_fact _osd_hosts
ansible.builtin.set_fact:
_osd_hosts: "{{ _osd_hosts | default([]) + [ [ item.0, item.2, item.3 ] ] }}"
with_nested:
- "{{ groups.get(osd_group_name) }}"
- "{{ osd_hosts }}"
when: hostvars[item.0]['ansible_facts']['hostname'] == item.1
- name: Set_fact host_list
ansible.builtin.set_fact:
host_list: "{{ host_list | default([]) | union([item.0]) }}"
loop: "{{ _osd_hosts }}"
- name: Get ceph-volume lvm list data
ceph_volume:
cluster: "{{ cluster }}"
action: list
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
register: _lvm_list_data
delegate_to: "{{ item }}"
loop: "{{ host_list }}"
- name: Set_fact _lvm_list
ansible.builtin.set_fact:
_lvm_list: "{{ _lvm_list | default({}) | combine(item.stdout | from_json) }}"
with_items: "{{ _lvm_list_data.results }}"
- name: Refresh /etc/ceph/osd files non containerized_deployment
ceph_volume_simple_scan:
cluster: "{{ cluster }}"
force: true
delegate_to: "{{ item }}"
loop: "{{ host_list }}"
when: not containerized_deployment | bool
- name: Get osd unit status
ansible.builtin.systemd:
name: ceph-osd@{{ item.2 }}
register: osd_status
delegate_to: "{{ item.0 }}"
loop: "{{ _osd_hosts }}"
when:
- containerized_deployment | bool
- name: Refresh /etc/ceph/osd files containerized_deployment
ansible.builtin.command: "{{ container_binary }} exec ceph-osd-{{ item.2 }} ceph-volume simple scan --force /var/lib/ceph/osd/{{ cluster }}-{{ item.2 }}"
changed_when: false
delegate_to: "{{ item.0 }}"
loop: "{{ _osd_hosts }}"
when:
- containerized_deployment | bool
- item.2 not in _lvm_list.keys()
- osd_status.results[0].status.ActiveState == 'active'
- name: Refresh /etc/ceph/osd files containerized_deployment when OSD container is down
when:
- containerized_deployment | bool
- osd_status.results[0].status.ActiveState != 'active'
block:
- name: Create tmp osd folder
ansible.builtin.file:
path: /var/lib/ceph/tmp/{{ cluster }}-{{ item.2 }}
state: directory
mode: '0755'
delegate_to: "{{ item.0 }}"
when: item.2 not in _lvm_list.keys()
loop: "{{ _osd_hosts }}"
- name: Activate OSD
ansible.builtin.command: |
{{ container_binary }} run -ti --pids-limit=-1 --rm --net=host --privileged=true --pid=host --ipc=host --cpus=1
-v /dev:/dev -v /etc/localtime:/etc/localtime:ro
-v /var/lib/ceph/tmp/:/var/lib/ceph/osd:z,rshared
-v /etc/ceph:/etc/ceph:z -v /var/run/ceph:/var/run/ceph:z
-v /var/run/udev/:/var/run/udev/ -v /var/log/ceph:/var/log/ceph:z
-e OSD_BLUESTORE=1 -e OSD_FILESTORE=0 -e OSD_DMCRYPT=0 -e CLUSTER=ceph -e DEBUG=verbose
-e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 -v /run/lvm/:/run/lvm/
-e CEPH_DAEMON=OSD_CEPH_VOLUME_ACTIVATE -e CONTAINER_IMAGE={{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag }}
-e OSD_ID={{ item.2 }}
--entrypoint=ceph-volume
{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag }}
simple activate {{ item.2 }} {{ item.1 }} --no-systemd
changed_when: false
delegate_to: "{{ item.0 }}"
when: item.2 not in _lvm_list.keys()
loop: "{{ _osd_hosts }}"
- name: Simple scan
ansible.builtin.command: |
{{ container_binary }} run -ti --pids-limit=-1 --rm --net=host --privileged=true --pid=host --ipc=host --cpus=1
-v /dev:/dev -v /etc/localtime:/etc/localtime:ro
-v /var/lib/ceph/tmp/:/var/lib/ceph/osd:z,rshared
-v /etc/ceph:/etc/ceph:z -v /var/run/ceph:/var/run/ceph:z
-v /var/run/udev/:/var/run/udev/ -v /var/log/ceph:/var/log/ceph:z
-e OSD_BLUESTORE=1 -e OSD_FILESTORE=0 -e OSD_DMCRYPT=0 -e CLUSTER=ceph -e DEBUG=verbose
-e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 -v /run/lvm/:/run/lvm/
-e CEPH_DAEMON=OSD_CEPH_VOLUME_ACTIVATE -e CONTAINER_IMAGE={{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag }}
-e OSD_ID={{ item.2 }}
--entrypoint=ceph-volume
{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag }}
simple scan --force /var/lib/ceph/osd/{{ cluster }}-{{ item.2 }}
changed_when: false
delegate_to: "{{ item.0 }}"
when: item.2 not in _lvm_list.keys()
loop: "{{ _osd_hosts }}"
- name: Umount OSD temp folder
ansible.posix.mount:
path: /var/lib/ceph/tmp/{{ cluster }}-{{ item.2 }}
state: unmounted
delegate_to: "{{ item.0 }}"
when: item.2 not in _lvm_list.keys()
loop: "{{ _osd_hosts }}"
- name: Remove OSD temp folder
ansible.builtin.file:
path: /var/lib/ceph/tmp/{{ cluster }}-{{ item.2 }}
state: absent
delegate_to: "{{ item.0 }}"
when: item.2 not in _lvm_list.keys()
loop: "{{ _osd_hosts }}"
- name: Find /etc/ceph/osd files
ansible.builtin.find:
paths: /etc/ceph/osd
pattern: "{{ item.2 }}-*"
register: ceph_osd_data
delegate_to: "{{ item.0 }}"
loop: "{{ _osd_hosts }}"
when: item.2 not in _lvm_list.keys()
- name: Slurp ceph osd files content
ansible.builtin.slurp:
src: "{{ item['files'][0]['path'] }}"
delegate_to: "{{ item.item.0 }}"
register: ceph_osd_files_content
loop: "{{ ceph_osd_data.results }}"
when:
- item.skipped is undefined
- item.matched > 0
- name: Set_fact ceph_osd_files_json
ansible.builtin.set_fact:
ceph_osd_data_json: "{{ ceph_osd_data_json | default({}) | combine({ item.item.item.2: item.content | b64decode | from_json}) }}"
with_items: "{{ ceph_osd_files_content.results }}"
when: item.skipped is undefined
- name: Mark osd(s) out of the cluster
ceph_osd:
ids: "{{ osd_to_kill.split(',') }}"
cluster: "{{ cluster }}"
state: out
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
run_once: true
- name: Stop osd(s) service
ansible.builtin.service:
name: ceph-osd@{{ item.2 }}
state: stopped
enabled: false
loop: "{{ _osd_hosts }}"
delegate_to: "{{ item.0 }}"
- name: Umount osd lockbox
ansible.posix.mount:
path: "/var/lib/ceph/osd-lockbox/{{ ceph_osd_data_json[item.2]['data']['uuid'] }}"
state: absent
loop: "{{ _osd_hosts }}"
delegate_to: "{{ item.0 }}"
when:
- not containerized_deployment | bool
- item.2 not in _lvm_list.keys()
- ceph_osd_data_json[item.2]['encrypted'] | default(False) | bool
- ceph_osd_data_json[item.2]['data']['uuid'] is defined
- name: Umount osd data
ansible.posix.mount:
path: "/var/lib/ceph/osd/{{ cluster }}-{{ item.2 }}"
state: absent
loop: "{{ _osd_hosts }}"
delegate_to: "{{ item.0 }}"
when: not containerized_deployment | bool
- name: Get parent device for data partition
ansible.builtin.command: lsblk --noheadings --output PKNAME --nodeps "{{ ceph_osd_data_json[item.2]['data']['path'] }}"
register: parent_device_data_part
loop: "{{ _osd_hosts }}"
delegate_to: "{{ item.0 }}"
changed_when: false
when:
- item.2 not in _lvm_list.keys()
- ceph_osd_data_json[item.2]['data']['path'] is defined
- name: Add pkname information in ceph_osd_data_json
ansible.builtin.set_fact:
ceph_osd_data_json: "{{ ceph_osd_data_json | default({}) | combine({item.item[2]: {'pkname_data': '/dev/' + item.stdout}}, recursive=True) }}"
loop: "{{ parent_device_data_part.results }}"
when: item.skipped is undefined
- name: Close dmcrypt close on devices if needed
ansible.builtin.command: "cryptsetup close {{ ceph_osd_data_json[item.2][item.3]['uuid'] }}"
with_nested:
- "{{ _osd_hosts }}"
- ['block_dmcrypt', 'block.db_dmcrypt', 'block.wal_dmcrypt', 'data', 'journal_dmcrypt']
delegate_to: "{{ item.0 }}"
failed_when: false
register: result
until: result is succeeded
changed_when: false
when:
- item.2 not in _lvm_list.keys()
- ceph_osd_data_json[item.2]['encrypted'] | default(False) | bool
- ceph_osd_data_json[item.2][item.3] is defined
- name: Use ceph-volume lvm zap to destroy all partitions
ceph_volume:
cluster: "{{ cluster }}"
action: zap
destroy: true
data: "{{ ceph_osd_data_json[item.2]['pkname_data'] if item.3 == 'data' else ceph_osd_data_json[item.2][item.3]['path'] }}"
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
with_nested:
- "{{ _osd_hosts }}"
- ['block', 'block.db', 'block.wal', 'journal', 'data']
delegate_to: "{{ item.0 }}"
failed_when: false
register: result
when:
- item.2 not in _lvm_list.keys()
- ceph_osd_data_json[item.2][item.3] is defined
- name: Zap osd devices
ceph_volume:
action: "zap"
osd_fsid: "{{ item.1 }}"
environment:
CEPH_VOLUME_DEBUG: "{{ ceph_volume_debug }}"
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
delegate_to: "{{ item.0 }}"
loop: "{{ _osd_hosts }}"
when: item.2 in _lvm_list.keys()
- name: Ensure osds are marked down
ceph_osd:
ids: "{{ osd_to_kill.split(',') }}"
cluster: "{{ cluster }}"
state: down
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
run_once: true
delegate_to: "{{ groups[mon_group_name][0] }}"
- name: Purge osd(s) from the cluster
ceph_osd:
ids: "{{ item }}"
cluster: "{{ cluster }}"
state: purge
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
run_once: true
with_items: "{{ osd_to_kill.split(',') }}"
- name: Remove osd data dir
ansible.builtin.file:
path: "/var/lib/ceph/osd/{{ cluster }}-{{ item.2 }}"
state: absent
loop: "{{ _osd_hosts }}"
delegate_to: "{{ item.0 }}"
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} -s"
changed_when: false
- name: Show ceph osd tree
ansible.builtin.command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} osd tree"
changed_when: false

View File

@ -0,0 +1,128 @@
---
# This playbook removes the Ceph RBD mirror from your cluster on the given
# node.
#
# Use it like this:
# ansible-playbook shrink-rbdmirror.yml -e rbdmirror_to_kill=ceph-rbdmirror01
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-rbdmirror.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Gather facts and check the init system
hosts:
- mons
- rbdmirrors
become: true
tasks:
- name: Gather facts on MONs and RBD mirrors
ansible.builtin.debug:
msg: gather facts on MONs and RBD mirrors
- name: Confirm whether user really meant to remove rbd mirror from the ceph
cluster
hosts: mons[0]
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Exit playbook, if no rbdmirror was given
ansible.builtin.fail:
msg: "rbdmirror_to_kill must be declared
Exiting shrink-cluster playbook, no RBD mirror was removed.
On the command line when invoking the playbook, you can use
-e rbdmirror_to_kill=rbd-mirror01 argument. You can only remove a
single rbd mirror each time the playbook runs."
when: rbdmirror_to_kill is not defined
- name: Exit playbook, if the rbdmirror is not part of the inventory
ansible.builtin.fail:
msg: >
It seems that the host given is not part of your inventory,
please make sure it is.
when: rbdmirror_to_kill not in groups[rbdmirror_group_name]
- name: Exit playbook, if user did not mean to shrink cluster
ansible.builtin.fail:
msg: "Exiting shrink-rbdmirror playbook, no rbd-mirror was removed.
To shrink the cluster, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Set_fact container_exec_cmd for mon0
when: containerized_deployment | bool
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_facts['hostname'] }}"
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} service dump -f json"
register: ceph_health
changed_when: false
until: ceph_health is succeeded
retries: 5
delay: 2
- name: Set_fact rbdmirror_to_kill_hostname
ansible.builtin.set_fact:
rbdmirror_to_kill_hostname: "{{ hostvars[rbdmirror_to_kill]['ansible_facts']['hostname'] }}"
- name: Set_fact rbdmirror_gids
ansible.builtin.set_fact:
rbdmirror_gids: "{{ rbdmirror_gids | default([]) + [item] }}"
with_items: "{{ (ceph_health.stdout | from_json)['services']['rbd-mirror']['daemons'].keys() | list }}"
when: item != 'summary'
- name: Set_fact rbdmirror_to_kill_gid
ansible.builtin.set_fact:
rbdmirror_to_kill_gid: "{{ (ceph_health.stdout | from_json)['services']['rbd-mirror']['daemons'][item]['gid'] }}"
with_items: "{{ rbdmirror_gids }}"
when: (ceph_health.stdout | from_json)['services']['rbd-mirror']['daemons'][item]['metadata']['id'] == rbdmirror_to_kill_hostname
tasks:
- name: Stop rbdmirror service
ansible.builtin.service:
name: ceph-rbd-mirror@rbd-mirror.{{ rbdmirror_to_kill_hostname }}
state: stopped
enabled: false
delegate_to: "{{ rbdmirror_to_kill }}"
failed_when: false
- name: Purge related directories
ansible.builtin.file:
path: /var/lib/ceph/bootstrap-rbd-mirror/{{ cluster }}-{{ rbdmirror_to_kill_hostname }}
state: absent
delegate_to: "{{ rbdmirror_to_kill }}"
post_tasks:
- name: Get servicemap details
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} service dump -f json"
register: ceph_health
failed_when:
- "'rbd-mirror' in (ceph_health.stdout | from_json)['services'].keys() | list"
- rbdmirror_to_kill_gid in (ceph_health.stdout | from_json)['services']['rbd-mirror']['daemons'].keys() | list
until:
- "'rbd-mirror' in (ceph_health.stdout | from_json)['services'].keys() | list"
- rbdmirror_to_kill_gid not in (ceph_health.stdout | from_json)['services']['rbd-mirror']['daemons'].keys() | list
changed_when: false
when: rbdmirror_to_kill_gid is defined
retries: 12
delay: 10
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
changed_when: false

View File

@ -0,0 +1,141 @@
---
# This playbook shrinks the Ceph RGW from your cluster
#
# Use it like this:
# ansible-playbook shrink-rgw.yml -e rgw_to_kill=ceph-rgw01
# Prompts for confirmation to shrink; it defaults to 'no' and
# does not shrink the cluster. Answering 'yes' shrinks the cluster.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-rgw.yml
# Overrides the prompt using the -e option. Can be used in
# automation scripts to avoid the interactive prompt.
- name: Confirm whether user really meant to remove rgw from the ceph cluster
hosts: localhost
become: false
gather_facts: false
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to shrink the cluster?
default: 'no'
private: false
tasks:
- name: Exit playbook, if no rgw was given
when: rgw_to_kill is not defined or rgw_to_kill | length == 0
ansible.builtin.fail:
msg: >
rgw_to_kill must be declared.
Exiting shrink-cluster playbook, no RGW was removed. On the command
line when invoking the playbook, you can use
"-e rgw_to_kill=ceph.rgw0 argument". You can only remove a single
RGW each time the playbook runs.
- name: Exit playbook, if user did not mean to shrink cluster
when: ireallymeanit != 'yes'
ansible.builtin.fail:
msg: >
Exiting shrink-rgw playbook, no RGW was removed. To shrink the
cluster, either say 'yes' on the prompt or use
'-e ireallymeanit=yes' on the command line when invoking the playbook
- name: Gather facts on mons and rgws
hosts:
- "{{ mon_group_name | default('mons') }}[0]"
- "{{ rgw_group_name | default('rgws') }}"
become: true
gather_facts: false
tasks:
- name: Gather facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
- name: Shrink rgw service
hosts: mons[0]
become: true
gather_facts: false
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary
- name: Set_fact container_exec_cmd for mon0
ansible.builtin.set_fact:
container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_facts['hostname'] }}"
when: containerized_deployment | bool
- name: Exit playbook, if can not connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} health"
register: ceph_health
changed_when: false
until: ceph_health is succeeded
retries: 5
delay: 2
- name: Get rgw instances
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} service dump -f json"
register: rgw_instances
changed_when: false
- name: Exit playbook, if the rgw_to_kill doesn't exist
when: rgw_to_kill not in (rgw_instances.stdout | from_json).services.rgw.daemons.keys() | list
ansible.builtin.fail:
msg: >
It seems that the rgw instance given is not part of the ceph cluster. Please
make sure it is.
The rgw instance name format is {hostname}.rgw{instance number}.
tasks:
- name: Get rgw host running the rgw instance to kill
ansible.builtin.set_fact:
rgw_host: '{{ item }}'
with_items: '{{ groups[rgw_group_name] }}'
when: hostvars[item]['ansible_facts']['hostname'] == rgw_to_kill.split('.')[0]
- name: Stop rgw service
ansible.builtin.service:
name: ceph-radosgw@rgw.{{ rgw_to_kill }}
state: stopped
enabled: false
delegate_to: "{{ rgw_host }}"
failed_when: false
- name: Ensure that the rgw is stopped
ansible.builtin.command: "systemctl is-active ceph-radosgw@rgw.{{ rgw_to_kill }}" # noqa command-instead-of-module
register: rgw_to_kill_status
failed_when: rgw_to_kill_status.rc == 0
changed_when: false
delegate_to: "{{ rgw_host }}"
retries: 5
delay: 2
- name: Exit if rgw_to_kill is reported in ceph status
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} service dump -f json"
register: ceph_status
changed_when: false
failed_when:
- (ceph_status.stdout | from_json).services.rgw is defined
- rgw_to_kill in (ceph_status.stdout | from_json).services.rgw.daemons.keys() | list
until:
- (ceph_status.stdout | from_json).services.rgw is defined
- rgw_to_kill not in (ceph_status.stdout | from_json).services.rgw.daemons.keys() | list
retries: 3
delay: 3
- name: Purge directories related to rgw
ansible.builtin.file:
path: /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ rgw_to_kill }}
state: absent
delegate_to: "{{ rgw_host }}"
post_tasks:
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
changed_when: false

View File

@ -0,0 +1,30 @@
---
# This playbook queries each OSD using `ceph-volume inventory` to report the
# entire storage device inventory of a cluster.
#
# Usage:
# ansible-playbook storage-inventory.yml
- name: Gather facts and check the init system
hosts: osds
become: true
tasks:
- name: Gather facts on all Ceph hosts
ansible.builtin.debug:
msg: "gather facts on all Ceph hosts for following reference"
- name: Query each host for storage device inventory
hosts: osds
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: List storage inventory
ceph_volume:
action: "inventory"
environment:
CEPH_VOLUME_DEBUG: "{{ ceph_volume_debug }}"
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"

View File

@ -0,0 +1,808 @@
---
# This playbook switches from non-containerized to containerized Ceph daemons
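#
# Usage sketch, mirroring the other infrastructure playbooks in this commit
# (the filename is taken from the fail message further down):
#   ansible-playbook switch-from-non-containerized-to-containerized-ceph-daemons.yml
#   ansible-playbook -e ireallymeanit=yes switch-from-non-containerized-to-containerized-ceph-daemons.yml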
- name: Confirm whether user really meant to switch from non-containerized to containerized ceph daemons
hosts: localhost
gather_facts: false
any_errors_fatal: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to switch from non-containerized to containerized ceph daemons?
default: 'no'
private: false
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Fail when less than three monitors
ansible.builtin.fail:
msg: "This playbook requires at least three monitors."
when: groups[mon_group_name] | length | int < 3
- name: Exit playbook, if user did not mean to switch from non-containerized to containerized daemons
ansible.builtin.fail:
msg: >
"Exiting switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook,
cluster did not switch from non-containerized to containerized ceph daemons.
To switch from non-containerized to containerized ceph daemons, either say 'yes' on the prompt
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Gather facts
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
- "{{ rbdmirror_group_name|default('rbdmirrors') }}"
- "{{ nfs_group_name|default('nfss') }}"
become: true
vars:
delegate_facts_host: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Gather and delegate facts
ansible.builtin.setup:
gather_subset:
- 'all'
- '!facter'
- '!ohai'
delegate_to: "{{ item }}"
delegate_facts: true
with_items: "{{ groups['all'] | difference(groups.get(client_group_name, [])) }}"
run_once: true
when: delegate_facts_host | bool
tags: always
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-validate role
ansible.builtin.import_role:
name: ceph-validate
- name: Switching from non-containerized to containerized ceph mon
vars:
containerized_deployment: true
switch_to_containers: true
mon_group_name: mons
hosts: "{{ mon_group_name|default('mons') }}"
serial: 1
become: true
pre_tasks:
- name: Select a running monitor
ansible.builtin.set_fact:
mon_host: "{{ item }}"
with_items: "{{ groups[mon_group_name] }}"
when: item != inventory_hostname
- name: Stop non-containerized ceph mon
ansible.builtin.service:
name: "ceph-mon@{{ ansible_facts['hostname'] }}"
state: stopped
enabled: false
- name: Remove old systemd unit files
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-mon@.service
- /usr/lib/systemd/system/ceph-mon.target
- /lib/systemd/system/ceph-mon@.service
- /lib/systemd/system/ceph-mon.target
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph/mon /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown -h {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
- name: Check for existing old leveldb file extension (ldb)
ansible.builtin.shell: stat /var/lib/ceph/mon/*/store.db/*.ldb
changed_when: false
failed_when: false
register: ldb_files
- name: Rename leveldb extension from ldb to sst
ansible.builtin.shell: rename -v .ldb .sst /var/lib/ceph/mon/*/store.db/*.ldb
changed_when: false
failed_when: false
when: ldb_files.rc == 0
- name: Copy mon initial keyring in /etc/ceph to satisfy fetch config task in ceph-container-common
ansible.builtin.command: cp /var/lib/ceph/mon/{{ cluster }}-{{ ansible_facts['hostname'] }}/keyring /etc/ceph/{{ cluster }}.mon.keyring
args:
creates: /etc/ceph/{{ cluster }}.mon.keyring
changed_when: false
failed_when: false
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-mon role
ansible.builtin.import_role:
name: ceph-mon
post_tasks:
- name: Waiting for the monitor to join the quorum...
ansible.builtin.command: "{{ container_binary }} run --rm -v /etc/ceph:/etc/ceph:z --entrypoint=ceph {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }} --cluster {{ cluster }} quorum_status --format json"
register: ceph_health_raw
until: ansible_facts['hostname'] in (ceph_health_raw.stdout | trim | from_json)["quorum_names"]
changed_when: false
retries: "{{ health_mon_check_retries }}"
delay: "{{ health_mon_check_delay }}"
- name: Switching from non-containerized to containerized ceph mgr
hosts: "{{ mgr_group_name|default('mgrs') }}"
vars:
containerized_deployment: true
mgr_group_name: mgrs
serial: 1
become: true
pre_tasks:
# failed_when: false is here because if we're
# working with a jewel cluster then ceph mgr
# will not exist
- name: Stop non-containerized ceph mgr(s)
ansible.builtin.service:
name: "ceph-mgr@{{ ansible_facts['hostname'] }}"
state: stopped
enabled: false
failed_when: false
- name: Remove old systemd unit files
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-mgr@.service
- /usr/lib/systemd/system/ceph-mgr.target
- /lib/systemd/system/ceph-mgr@.service
- /lib/systemd/system/ceph-mgr.target
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph/mgr /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown -h {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-mgr role
ansible.builtin.import_role:
name: ceph-mgr
- name: Set osd flags
hosts: "{{ mon_group_name | default('mons') }}[0]"
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary.yml
- name: Get pool list
ansible.builtin.command: "{{ ceph_cmd }} --cluster {{ cluster }} osd pool ls detail -f json"
register: pool_list
changed_when: false
check_mode: false
- name: Get balancer module status
ansible.builtin.command: "{{ ceph_cmd }} --cluster {{ cluster }} balancer status -f json"
register: balancer_status_switch
changed_when: false
check_mode: false
- name: Set_fact pools_pgautoscaler_mode
ansible.builtin.set_fact:
pools_pgautoscaler_mode: "{{ pools_pgautoscaler_mode | default([]) | union([{'name': item.pool_name, 'mode': item.pg_autoscale_mode}]) }}"
with_items: "{{ pool_list.stdout | default('{}') | from_json }}"
- name: Disable balancer
ansible.builtin.command: "{{ ceph_cmd }} --cluster {{ cluster }} balancer off"
changed_when: false
when: (balancer_status_switch.stdout | from_json)['active'] | bool
- name: Disable pg autoscale on pools
ceph_pool:
name: "{{ item.name }}"
cluster: "{{ cluster }}"
pg_autoscale_mode: false
with_items: "{{ pools_pgautoscaler_mode }}"
when:
- pools_pgautoscaler_mode is defined
- item.mode == 'on'
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
- name: Set osd flags
ceph_osd_flag:
name: "{{ item }}"
cluster: "{{ cluster }}"
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
with_items:
- noout
- nodeep-scrub
- name: Switching from non-containerized to containerized ceph osd
vars:
containerized_deployment: true
osd_group_name: osds
switch_to_containers: true
hosts: "{{ osd_group_name|default('osds') }}"
serial: 1
become: true
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Collect running osds
ansible.builtin.shell: |
set -o pipefail;
systemctl list-units | grep -E "loaded * active" | grep -Eo 'ceph-osd@[0-9]+.service|ceph-volume'
register: running_osds
changed_when: false
failed_when: false
# systemd module does not support --runtime option
- name: Disable ceph-osd@.service runtime-enabled
ansible.builtin.command: "systemctl disable --runtime {{ item }}" # noqa command-instead-of-module
changed_when: false
failed_when: false
with_items: "{{ running_osds.stdout_lines | default([]) }}"
when: item.startswith('ceph-osd@')
- name: Stop/disable/mask non-containerized ceph osd(s) (if any)
ansible.builtin.systemd:
name: "{{ item }}"
state: stopped
enabled: false
with_items: "{{ running_osds.stdout_lines | default([]) }}"
when: running_osds != []
- name: Disable ceph.target
ansible.builtin.systemd:
name: ceph.target
enabled: false
- name: Remove old ceph-osd systemd units
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-osd.target
- /usr/lib/systemd/system/ceph-osd@.service
- /usr/lib/systemd/system/ceph-volume@.service
- /lib/systemd/system/ceph-osd.target
- /lib/systemd/system/ceph-osd@.service
- /lib/systemd/system/ceph-volume@.service
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph/osd /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown -h {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
- name: Check for existing old leveldb file extension (ldb)
ansible.builtin.shell: stat /var/lib/ceph/osd/*/current/omap/*.ldb
changed_when: false
failed_when: false
register: ldb_files
- name: Rename leveldb extension from ldb to sst
ansible.builtin.shell: rename -v .ldb .sst /var/lib/ceph/osd/*/current/omap/*.ldb
changed_when: false
failed_when: false
when: ldb_files.rc == 0
- name: Check if containerized osds are already running
ansible.builtin.command: >
{{ container_binary }} ps -q --filter='name=ceph-osd'
changed_when: false
failed_when: false
register: osd_running
- name: Get osd directories
ansible.builtin.command: >
find /var/lib/ceph/osd {% if dmcrypt | bool %}/var/lib/ceph/osd-lockbox{% endif %} -maxdepth 1 -mindepth 1 -type d
register: osd_dirs
changed_when: false
failed_when: false
- name: Unmount all the osd directories
ansible.builtin.command: >
umount {{ item }}
changed_when: false
failed_when: false
with_items: "{{ osd_dirs.stdout_lines }}"
when: osd_running.rc != 0 or osd_running.stdout_lines | length == 0
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-osd role
ansible.builtin.import_role:
name: ceph-osd
post_tasks:
- name: Container - waiting for clean pgs...
ansible.builtin.command: >
{{ container_binary }} exec ceph-mon-{{ hostvars[groups[mon_group_name][0]]['ansible_facts']['hostname'] }} ceph --cluster {{ cluster }} pg stat --format json
register: ceph_health_post
until: >
(((ceph_health_post.stdout | from_json).pg_summary.num_pg_by_state | length) > 0)
and
(((ceph_health_post.stdout | from_json).pg_summary.num_pg_by_state | selectattr('name', 'search', '^active\\+clean') | map(attribute='num') | list | sum) == (ceph_health_post.stdout | from_json).pg_summary.num_pgs)
delegate_to: "{{ groups[mon_group_name][0] }}"
retries: "{{ health_osd_check_retries }}"
delay: "{{ health_osd_check_delay }}"
changed_when: false
- name: Unset osd flags
hosts: "{{ mon_group_name | default('mons') }}[0]"
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary.yml
- name: Re-enable pg autoscale on pools
ceph_pool:
name: "{{ item.name }}"
cluster: "{{ cluster }}"
pg_autoscale_mode: true
with_items: "{{ pools_pgautoscaler_mode }}"
when:
- pools_pgautoscaler_mode is defined
- item.mode == 'on'
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
- name: Unset osd flags
ceph_osd_flag:
name: "{{ item }}"
cluster: "{{ cluster }}"
state: absent
environment:
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment | bool else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
with_items:
- noout
- nodeep-scrub
- name: Re-enable balancer
ansible.builtin.command: "{{ ceph_cmd }} --cluster {{ cluster }} balancer on"
changed_when: false
when: (balancer_status_switch.stdout | from_json)['active'] | bool
- name: Switching from non-containerized to containerized ceph mds
hosts: "{{ mds_group_name|default('mdss') }}"
vars:
containerized_deployment: true
mds_group_name: mdss
serial: 1
become: true
pre_tasks:
- name: Stop non-containerized ceph mds(s)
ansible.builtin.service:
name: "ceph-mds@{{ ansible_facts['hostname'] }}"
state: stopped
enabled: false
- name: Remove old systemd unit files
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-mds@.service
- /usr/lib/systemd/system/ceph-mds.target
- /lib/systemd/system/ceph-mds@.service
- /lib/systemd/system/ceph-mds.target
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph/mds /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-mds role
ansible.builtin.import_role:
name: ceph-mds
- name: Switching from non-containerized to containerized ceph rgw
hosts: "{{ rgw_group_name|default('rgws') }}"
vars:
containerized_deployment: true
rgw_group_name: rgws
serial: 1
become: true
pre_tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
- name: Import ceph-config role
ansible.builtin.import_role:
name: ceph-config
tasks_from: rgw_systemd_environment_file.yml
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph/radosgw /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
tasks:
- name: Stop non-containerized ceph rgw(s)
ansible.builtin.service:
name: "ceph-radosgw@rgw.{{ rgw_zone }}.{{ ansible_facts['hostname'] }}.{{ item.instance_name }}"
state: stopped
enabled: false
with_items: "{{ rgw_instances }}"
- name: Remove old systemd unit files
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-radosgw@.service
- /usr/lib/systemd/system/ceph-radosgw.target
- /lib/systemd/system/ceph-radosgw@.service
- /lib/systemd/system/ceph-radosgw.target
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-rgw role
ansible.builtin.import_role:
name: ceph-rgw
- name: Switching from non-containerized to containerized ceph rbd-mirror
hosts: "{{ rbdmirror_group_name|default('rbdmirrors') }}"
vars:
containerized_deployment: true
rbdmirror_group_name: rbdmirrors
serial: 1
become: true
pre_tasks:
- name: Check for ceph rbd mirror services
ansible.builtin.command: systemctl show --no-pager --property=Id ceph-rbd-mirror@* # noqa: command-instead-of-module
changed_when: false
register: rbdmirror_services
- name: Stop non-containerized ceph rbd mirror(s) # noqa: ignore-errors
ansible.builtin.service:
name: "{{ item.split('=')[1] }}"
state: stopped
enabled: false
ignore_errors: true
loop: "{{ rbdmirror_services.stdout_lines }}"
- name: Remove old systemd unit files
ansible.builtin.file:
path: "{{ item }}"
state: absent
with_items:
- /usr/lib/systemd/system/ceph-rbd-mirror@.service
- /usr/lib/systemd/system/ceph-rbd-mirror.target
- /lib/systemd/system/ceph-rbd-mirror@.service
- /lib/systemd/system/ceph-rbd-mirror.target
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-rbd-mirror role
ansible.builtin.import_role:
name: ceph-rbd-mirror
- name: Switching from non-containerized to containerized ceph nfs
hosts: "{{ nfs_group_name|default('nfss') }}"
vars:
containerized_deployment: true
nfs_group_name: nfss
serial: 1
become: true
pre_tasks:
# failed_when: false is here because if we're
# working with a jewel cluster then ceph nfs
# will not exist
- name: Stop non-containerized ceph nfs(s)
ansible.builtin.service:
name: nfs-ganesha
state: stopped
enabled: false
failed_when: false
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false
tasks:
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-container-engine role
ansible.builtin.import_role:
name: ceph-container-engine
- name: Import ceph-container-common role
ansible.builtin.import_role:
name: ceph-container-common
- name: Import ceph-nfs role
ansible.builtin.import_role:
name: ceph-nfs
- name: Switching from non-containerized to containerized ceph-crash
hosts:
- "{{ mon_group_name | default('mons') }}"
- "{{ osd_group_name | default('osds') }}"
- "{{ mds_group_name | default('mdss') }}"
- "{{ rgw_group_name | default('rgws') }}"
- "{{ rbdmirror_group_name | default('rbdmirrors') }}"
- "{{ mgr_group_name | default('mgrs') }}"
vars:
containerized_deployment: true
become: true
tasks:
- name: Stop non-containerized ceph-crash
ansible.builtin.service:
name: ceph-crash
state: stopped
enabled: false
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary.yml
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-crash role
ansible.builtin.import_role:
name: ceph-crash
- name: Switching from non-containerized to containerized ceph-exporter
hosts:
- "{{ mon_group_name | default('mons') }}"
- "{{ osd_group_name | default('osds') }}"
- "{{ mds_group_name | default('mdss') }}"
- "{{ rgw_group_name | default('rgws') }}"
- "{{ rbdmirror_group_name | default('rbdmirrors') }}"
- "{{ mgr_group_name | default('mgrs') }}"
vars:
containerized_deployment: true
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-facts role
ansible.builtin.import_role:
name: ceph-facts
tasks_from: container_binary.yml
- name: Import ceph-handler role
ansible.builtin.import_role:
name: ceph-handler
- name: Import ceph-exporter role
ansible.builtin.import_role:
name: ceph-exporter
- name: Final task
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ mgr_group_name|default('mgrs') }}"
- "{{ osd_group_name|default('osds') }}"
- "{{ mds_group_name|default('mdss') }}"
- "{{ rgw_group_name|default('rgws') }}"
vars:
containerized_deployment: true
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
# NOTE: changed from file module to raw find command for performance reasons
# The file module has to run checks on current ownership of all directories and files. This is unnecessary
# as in this case we know we want all owned by ceph user
- name: Set proper ownership on ceph directories
ansible.builtin.command: "find /var/lib/ceph /etc/ceph -not -( -user {{ ceph_uid }} -or -group {{ ceph_uid }} -) -execdir chown {{ ceph_uid }}:{{ ceph_uid }} {} +"
changed_when: false

View File

@ -0,0 +1,73 @@
---
# NOTE (leseb):
# The playbook aims to take over a cluster that was not configured with
# ceph-ansible.
#
# The procedure is as follows:
#
# 1. Install Ansible and add your monitor and osd hosts to its inventory. For more detailed information you can read the [Ceph Ansible Wiki](https://github.com/ceph/ceph-ansible/wiki)
# 2. Set `generate_fsid: false` in `group_vars`
# 3. Get your current cluster fsid with `ceph fsid` and set `fsid` accordingly in `group_vars` (see the commented sketch below)
# 4. Run the playbook called: `take-over-existing-cluster.yml` like this `ansible-playbook take-over-existing-cluster.yml`.
# 5. Eventually run Ceph Ansible to validate everything by doing: `ansible-playbook site.yml`.
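#
# A commented sketch of the group_vars entries for steps 2 and 3 (the fsid
# value below is only a placeholder, use the value reported by `ceph fsid`
# on your cluster):
#
# generate_fsid: false
# fsid: 00000000-0000-0000-0000-000000000000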
- name: Fetch keys
hosts: mons
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
- name: Import ceph-fetch-keys role
ansible.builtin.import_role:
name: ceph-fetch-keys
- name: Take over existing cluster
hosts:
- mons
- osds
- mdss
- rgws
- nfss
- rbdmirrors
- clients
- mgrs
become: true
tasks:
- name: Import ceph-defaults role
ansible.builtin.import_role:
name: ceph-defaults
post_tasks:
- name: Get the name of the existing ceph cluster
ansible.builtin.shell: |
set -o pipefail;
basename $(grep --exclude '*.bak' -R fsid /etc/ceph/ | egrep -o '^[^.]*' | head -n 1)
changed_when: false
register: cluster_name
- name: Run stat module on Ceph configuration file
ansible.builtin.stat:
path: "/etc/ceph/{{ cluster_name.stdout }}.conf"
register: ceph_conf_stat
# Creates a backup of original ceph conf file in 'cluster_name-YYYYMMDDTHHMMSS.conf.bak' format
- name: Make a backup of original Ceph configuration file
ansible.builtin.copy:
src: "/etc/ceph/{{ cluster_name.stdout }}.conf"
dest: "/etc/ceph/{{ cluster_name.stdout }}-{{ ansible_date_time.iso8601_basic_short }}.conf.bak"
remote_src: true
owner: "{{ ceph_conf_stat.stat.pw_name }}"
group: "{{ ceph_conf_stat.stat.gr_name }}"
mode: "{{ ceph_conf_stat.stat.mode }}"
- name: Generate ceph configuration file
openstack.config_template.config_template:
src: "roles/ceph-config/templates/ceph.conf.j2"
dest: "/etc/ceph/{{ cluster_name.stdout }}.conf"
owner: "{{ ceph_conf_stat.stat.pw_name }}"
group: "{{ ceph_conf_stat.stat.gr_name }}"
mode: "{{ ceph_conf_stat.stat.mode }}"
config_overrides: "{{ ceph_conf_overrides }}"
config_type: ini

View File

@ -0,0 +1,39 @@
---
# This playbook was made to automate Ceph servers maintenance
# Typical use case: hardware change
# By running this playbook you will set the 'noout' flag on your
# cluster, which means that OSDs **can't** be marked out
# of the CRUSH map, but they will be marked as down.
# Basically we tell the cluster not to move any data since
# the operation won't last too long.
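#
# Before powering the node off you can double check that the flag is actually
# set by looking at the cluster flags from any monitor (a quick manual sanity
# check, not required by the playbook):
# ceph osd dump | grep flags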
- hosts: <your_host>
gather_facts: false
tasks:
- name: Set the noout flag
ansible.builtin.command: ceph osd set noout
delegate_to: <your_monitor>
- name: Turn off the server
ansible.builtin.command: poweroff
- name: Wait for the server to go down
local_action:
module: wait_for
host: <your_host>
port: 22
state: stopped
- name: Wait for the server to come up
local_action:
module: wait_for
host: <your_host>
port: 22
delay: 10
timeout: 3600
- name: Unset the noout flag
ansible.builtin.command: ceph osd unset noout
delegate_to: <your_monitor>

View File

@ -0,0 +1,552 @@
---
# This playbook was meant to upgrade a node from Ubuntu to RHEL.
# We are performing a set of actions prior to reboot the node.
# The node reboots via PXE and gets its new operating system.
# This playbook only works for monitors and OSDs.
# Note that some of the checks are ugly:
# e.g. the `when: migration_completed.stat.exists` conditions
# could be improved with includes, however I wanted to keep a single file...
#
- hosts: mons
serial: 1
sudo: true
vars:
backup_dir: /tmp/
tasks:
- name: Check if the node has been migrated already
ansible.builtin.stat: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/migration_completed
register: migration_completed
failed_when: false
- name: Check for failed run
ansible.builtin.stat: >
path=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
register: mon_archive_leftover
- fail: msg="Looks like an archive is already there, please remove it!"
when: migration_completed.stat.exists == False and mon_archive_leftover.stat.exists == True
- name: Compress the store as much as possible
ansible.builtin.command: ceph tell mon.{{ ansible_facts['hostname'] }} compact
when: migration_completed.stat.exists == False
- name: Check if sysvinit
ansible.builtin.stat: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/sysvinit
register: monsysvinit
changed_when: false
- name: Check if upstart
ansible.builtin.stat: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/upstart
register: monupstart
changed_when: false
- name: Check if init does what it is supposed to do (Sysvinit)
ansible.builtin.shell: >
ps faux|grep -sq [c]eph-mon && service ceph status mon >> /dev/null
register: ceph_status_sysvinit
changed_when: false
# can't complete the condition since the previous task never ran...
- fail: msg="Something is terribly wrong here, sysvinit is configured, the service is started BUT the init script does not return 0, GO FIX YOUR SETUP!"
when: ceph_status_sysvinit.rc != 0 and migration_completed.stat.exists == False and monsysvinit.stat.exists == True
- name: Check if init does what it is supposed to do (upstart)
ansible.builtin.shell: >
ps faux|grep -sq [c]eph-mon && status ceph-mon-all >> /dev/null
register: ceph_status_upstart
changed_when: false
- fail: msg="Something is terribly wrong here, upstart is configured, the service is started BUT the init script does not return 0, GO FIX YOUR SETUP!"
when: ceph_status_upstart.rc != 0 and migration_completed.stat.exists == False and monupstart.stat.exists == True
- name: Restart the Monitor after compaction (Upstart)
service: >
name=ceph-mon
state=restarted
args=id={{ ansible_facts['hostname'] }}
when: monupstart.stat.exists == True and migration_completed.stat.exists == False
- name: Restart the Monitor after compaction (Sysvinit)
service: >
name=ceph
state=restarted
args=mon
when: monsysvinit.stat.exists == True and migration_completed.stat.exists == False
- name: Wait for the monitor to be up again
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
port: 6789
timeout: 10
when: migration_completed.stat.exists == False
- name: Stop the monitor (Upstart)
service: >
name=ceph-mon
state=stopped
args=id={{ ansible_facts['hostname'] }}
when: monupstart.stat.exists == True and migration_completed.stat.exists == False
- name: Stop the monitor (Sysvinit)
service: >
name=ceph
state=stopped
args=mon
when: monsysvinit.stat.exists == True and migration_completed.stat.exists == False
- name: Wait for the monitor to be down
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
port: 6789
timeout: 10
state: stopped
when: migration_completed.stat.exists == False
- name: Create a backup directory
file: >
path={{ backup_dir }}/monitors-backups
state=directory
owner=root
group=root
mode=0644
delegate_to: "{{ item }}"
with_items: "{{ groups.backup[0] }}"
when: migration_completed.stat.exists == False
# NOTE (leseb): should we convert upstart to sysvinit here already?
- name: Archive monitor stores
ansible.builtin.shell: >
tar -cpvzf - --one-file-system . /etc/ceph/* | cat > {{ ansible_facts['hostname'] }}.tar
chdir=/var/lib/ceph/
creates={{ ansible_facts['hostname'] }}.tar
when: migration_completed.stat.exists == False
- name: Scp the Monitor store
fetch: >
src=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
dest={{ backup_dir }}/monitors-backups/{{ ansible_facts['hostname'] }}.tar
flat=yes
when: migration_completed.stat.exists == False
- name: Reboot the server
ansible.builtin.command: reboot
when: migration_completed.stat.exists == False
- name: Wait for the server to come up
local_action:
module: wait_for
port: 22
delay: 10
timeout: 3600
when: migration_completed.stat.exists == False
- name: Wait a bit more to be sure that the server is ready
pause: seconds=20
when: migration_completed.stat.exists == False
- name: Check if sysvinit
ansible.builtin.stat: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/sysvinit
register: monsysvinit
changed_when: false
- name: Check if upstart
ansible.builtin.stat: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/upstart
register: monupstart
changed_when: false
- name: Make sure the monitor is stopped (Upstart)
service: >
name=ceph-mon
state=stopped
args=id={{ ansible_facts['hostname'] }}
when: monupstart.stat.exists == True and migration_completed.stat.exists == False
- name: Make sure the monitor is stopped (Sysvinit)
service: >
name=ceph
state=stopped
args=mon
when: monsysvinit.stat.exists == True and migration_completed.stat.exists == False
# NOTE (leseb): 'creates' was added in Ansible 1.6
- name: Copy and unarchive the monitor store
unarchive: >
src={{ backup_dir }}/monitors-backups/{{ ansible_facts['hostname'] }}.tar
dest=/var/lib/ceph/
copy=yes
mode=0600
creates=etc/ceph/ceph.conf
when: migration_completed.stat.exists == False
- name: Copy keys and configs
ansible.builtin.shell: >
cp etc/ceph/* /etc/ceph/
chdir=/var/lib/ceph/
when: migration_completed.stat.exists == False
- name: Configure RHEL7 for sysvinit
ansible.builtin.shell: find -L /var/lib/ceph/mon/ -mindepth 1 -maxdepth 1 -regextype posix-egrep -regex '.*/[A-Za-z0-9]+-[A-Za-z0-9._-]+' -exec touch {}/sysvinit \; -exec rm {}/upstart \;
when: migration_completed.stat.exists == False
# NOTE (leseb): at this point the upstart and sysvinit checks are not necessary
# so we directly call sysvinit
- name: Start the monitor
service: >
name=ceph
state=started
args=mon
when: migration_completed.stat.exists == False
- name: Wait for the Monitor to be up again
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
port: 6789
timeout: 10
when: migration_completed.stat.exists == False
- name: Waiting for the monitor to join the quorum...
ansible.builtin.shell: >
ceph -s | grep monmap | sed 's/.*quorum//' | egrep -q {{ ansible_facts['hostname'] }}
register: result
until: result.rc == 0
retries: 5
delay: 10
delegate_to: "{{ item }}"
with_items: "{{ groups.backup[0] }}"
when: migration_completed.stat.exists == False
- name: Done moving to the next monitor
file: >
path=/var/lib/ceph/mon/ceph-{{ ansible_facts['hostname'] }}/migration_completed
state=touch
owner=root
group=root
mode=0600
when: migration_completed.stat.exists == False
- hosts: osds
serial: 1
sudo: true
vars:
backup_dir: /tmp/
tasks:
- name: Check if the node has been migrated already
ansible.builtin.stat: >
path=/var/lib/ceph/migration_completed
register: migration_completed
failed_when: false
- name: Check for failed run
ansible.builtin.stat: >
path=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
register: osd_archive_leftover
- fail: msg="Looks like an archive is already there, please remove it!"
when: migration_completed.stat.exists == False and osd_archive_leftover.stat.exists == True
- name: Check if init does what it is supposed to do (Sysvinit)
ansible.builtin.shell: >
ps faux|grep -sq [c]eph-osd && service ceph status osd >> /dev/null
register: ceph_status_sysvinit
changed_when: false
# can't complete the condition since the previous task never ran...
- fail: msg="Something is terribly wrong here, sysvinit is configured, the services are started BUT the init script does not return 0, GO FIX YOUR SETUP!"
when: ceph_status_sysvinit.rc != 0 and migration_completed.stat.exists == False and monsysvinit.stat.exists == True
- name: Check if init does what it is supposed to do (upstart)
ansible.builtin.shell: >
ps faux|grep -sq [c]eph-osd && initctl list|egrep -sq "ceph-osd \(ceph/.\) start/running, process [0-9][0-9][0-9][0-9]"
register: ceph_status_upstart
changed_when: false
- fail: msg="Something is terribly wrong here, upstart is configured, the services are started BUT the init script does not return 0, GO FIX YOUR SETUP!"
when: ceph_status_upstart.rc != 0 and migration_completed.stat.exists == False and monupstart.stat.exists == True
- name: Set the noout flag
ansible.builtin.command: ceph osd set noout
delegate_to: "{{ item }}"
with_items: "{{ groups[mon_group_name][0] }}"
when: migration_completed.stat.exists == False
- name: Check if sysvinit
ansible.builtin.shell: stat /var/lib/ceph/osd/ceph-*/sysvinit
register: osdsysvinit
failed_when: false
changed_when: false
- name: Check if upstart
ansible.builtin.shell: stat /var/lib/ceph/osd/ceph-*/upstart
register: osdupstart
failed_when: false
changed_when: false
- name: Archive ceph configs
ansible.builtin.shell: >
tar -cpvzf - --one-file-system . /etc/ceph/ceph.conf | cat > {{ ansible_facts['hostname'] }}.tar
chdir=/var/lib/ceph/
creates={{ ansible_facts['hostname'] }}.tar
when: migration_completed.stat.exists == False
- name: Create backup directory
file: >
path={{ backup_dir }}/osds-backups
state=directory
owner=root
group=root
mode=0644
delegate_to: "{{ item }}"
with_items: "{{ groups.backup[0] }}"
when: migration_completed.stat.exists == False
- name: Scp OSDs dirs and configs
fetch: >
src=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
dest={{ backup_dir }}/osds-backups/
flat=yes
when: migration_completed.stat.exists == False
- name: Collect OSD ports
ansible.builtin.shell: netstat -tlpn | awk -F ":" '/ceph-osd/ { sub (" .*", "", $2); print $2 }' | uniq
register: osd_ports
when: migration_completed.stat.exists == False
- name: Gracefully stop the OSDs (Upstart)
service: >
name=ceph-osd-all
state=stopped
when: osdupstart.rc == 0 and migration_completed.stat.exists == False
- name: Gracefully stop the OSDs (Sysvinit)
service: >
name=ceph
state=stopped
args=osd
when: osdsysvinit.rc == 0 and migration_completed.stat.exists == False
- name: Wait for the OSDs to be down
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
port: "{{ item }}"
timeout: 10
state: stopped
with_items: "{{ osd_ports.stdout_lines }}"
when: migration_completed.stat.exists == False
- name: Configure RHEL with sysvinit
ansible.builtin.shell: find -L /var/lib/ceph/osd/ -mindepth 1 -maxdepth 1 -regextype posix-egrep -regex '.*/[A-Za-z0-9]+-[A-Za-z0-9._-]+' -exec touch {}/sysvinit \; -exec rm {}/upstart \;
when: migration_completed.stat.exists == False
- name: Reboot the server
ansible.builtin.command: reboot
when: migration_completed.stat.exists == False
- name: Wait for the server to come up
local_action:
module: wait_for
port: 22
delay: 10
timeout: 3600
when: migration_completed.stat.exists == False
- name: Wait a bit to be sure that the server is ready for scp
pause: seconds=20
when: migration_completed.stat.exists == False
# NOTE (leseb): 'creates' was added in Ansible 1.6
- name: Copy and unarchive the OSD configs
unarchive: >
src={{ backup_dir }}/osds-backups/{{ ansible_facts['hostname'] }}.tar
dest=/var/lib/ceph/
copy=yes
mode=0600
creates=etc/ceph/ceph.conf
when: migration_completed.stat.exists == False
- name: Copy keys and configs
ansible.builtin.shell: >
cp etc/ceph/* /etc/ceph/
chdir=/var/lib/ceph/
when: migration_completed.stat.exists == False
# NOTE (leseb): at this point the upstart and sysvinit checks are not necessary
# so we directly call sysvinit
- name: Start all the OSDs
service: >
name=ceph-osd-all
state=started
args=osd
when: migration_completed.stat.exists == False
# NOTE (leseb): this is tricky unless this is set into the ceph.conf
# listened ports can be predicted, thus they will change after each restart
# - name: Wait for the OSDs to be up again
# local_action: >
# wait_for
# host={{ ansible_ssh_host | default(inventory_hostname) }}
# port={{ item }}
# timeout=30
# with_items:
# - "{{ osd_ports.stdout_lines }}"
- name: Waiting for clean PGs...
ansible.builtin.shell: >
test "[""$(ceph -s -f json | python -c 'import sys, json; print(json.load(sys.stdin)["pgmap"]["num_pgs"])')""]" = "$(ceph -s -f json | python -c 'import sys, json; print([ i["count"] for i in json.load(sys.stdin)["pgmap"]["pgs_by_state"] if i["state_name"] == "active+clean"])')"
register: result
until: result.rc == 0
retries: 10
delay: 10
delegate_to: "{{ item }}"
with_items: "{{ groups.backup[0] }}"
when: migration_completed.stat.exists == False
- name: Done moving to the next OSD
file: >
path=/var/lib/ceph/migration_completed
state=touch
owner=root
group=root
mode=0600
when: migration_completed.stat.exists == False
- name: Unset the noout flag
ansible.builtin.command: ceph osd unset noout
delegate_to: "{{ item }}"
with_items: "{{ groups[mon_group_name][0] }}"
when: migration_completed.stat.exists == False
- hosts: rgws
serial: 1
sudo: true
vars:
backup_dir: /tmp/
tasks:
- name: Check if the node has been migrated already
ansible.builtin.stat: >
path=/var/lib/ceph/radosgw/migration_completed
register: migration_completed
failed_when: false
- name: Check for failed run
ansible.builtin.stat: >
path=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
register: rgw_archive_leftover
- fail: msg="Looks like an archive is already there, please remove it!"
when: migration_completed.stat.exists == False and rgw_archive_leftover.stat.exists == True
- name: Archive rados gateway configs
ansible.builtin.shell: >
tar -cpvzf - --one-file-system . /etc/ceph/* | cat > {{ ansible_facts['hostname'] }}.tar
chdir=/var/lib/ceph/
creates={{ ansible_facts['hostname'] }}.tar
when: migration_completed.stat.exists == False
- name: Create backup directory
file: >
path={{ backup_dir }}/rgws-backups
state=directory
owner=root
group=root
mode=0644
delegate_to: "{{ item }}"
with_items: "{{ groups.backup[0] }}"
when: migration_completed.stat.exists == False
- name: Scp RGWs dirs and configs
fetch: >
src=/var/lib/ceph/{{ ansible_facts['hostname'] }}.tar
dest={{ backup_dir }}/rgws-backups/
flat=yes
when: migration_completed.stat.exists == False
- name: Gracefully stop the rados gateway
service: >
name={{ item }}
state=stopped
with_items: radosgw
when: migration_completed.stat.exists == False
- name: Wait for radosgw to be down
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
path: /tmp/radosgw.sock
state: absent
timeout: 30
when: migration_completed.stat.exists == False
- name: Reboot the server
ansible.builtin.command: reboot
when: migration_completed.stat.exists == False
- name: Wait for the server to come up
local_action:
module: wait_for
port: 22
delay: 10
timeout: 3600
when: migration_completed.stat.exists == False
- name: Wait a bit to be sure that the server is ready for scp
pause: seconds=20
when: migration_completed.stat.exists == False
# NOTE (leseb): 'creates' was added in Ansible 1.6
- name: Copy and unarchive the rados gateway configs
unarchive: >
src={{ backup_dir }}/rgws-backups/{{ ansible_facts['hostname'] }}.tar
dest=/var/lib/ceph/
copy=yes
mode=0600
creates=etc/ceph/ceph.conf
when: migration_completed.stat.exists == False
- name: Copy keys and configs
ansible.builtin.shell: >
{{ item }}
chdir=/var/lib/ceph/
with_items: cp etc/ceph/* /etc/ceph/
when: migration_completed.stat.exists == False
- name: Start rados gateway
service: >
name={{ item }}
state=started
with_items: radosgw
when: migration_completed.stat.exists == False
- name: Wait for radosgw to be up again
local_action:
module: wait_for
host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
path: /tmp/radosgw.sock
state: present
timeout: 30
when: migration_completed.stat.exists == False
- name: Done moving to the next rados gateway
file: >
path=/var/lib/ceph/radosgw/migration_completed
state=touch
owner=root
group=root
mode=0600
when: migration_completed.stat.exists == False

View File

@ -0,0 +1,97 @@
---
# This playbook will make a custom partition layout for your osd hosts.
# You should define the `devices` variable for every host.
#
# For example, in host_vars/hostname1
#
# devices:
# - device_name: sdb
# partitions:
# - index: 1
# size: 10G
# type: data
# - index: 2
# size: 5G
# type: journal
# - device_name: sdc
# partitions:
# - index: 1
# size: 10G
# type: data
# - index: 2
# size: 5G
# type: journal
#
- vars:
osd_group_name: osds
journal_typecode: 45b0969e-9b03-4f30-b4c6-b4b80ceff106
data_typecode: 4fbd7e29-9d25-41b8-afd0-062c0ceff05d
devices: []
hosts: "{{ osd_group_name }}"
tasks:
- name: Load a variable file for devices partition
include_vars: "{{ item }}"
with_first_found:
- files:
- "host_vars/{{ ansible_facts['hostname'] }}.yml"
- "host_vars/default.yml"
skip: true
- name: Exit playbook, if devices not defined
ansible.builtin.fail:
msg: "devices must be define in host_vars/default.yml or host_vars/{{ ansible_facts['hostname'] }}.yml"
when: devices is not defined
- name: Install sgdisk(gdisk)
ansible.builtin.package:
name: gdisk
state: present
register: result
until: result is succeeded
- name: Erase all previous partitions(dangerous!!!)
ansible.builtin.shell: sgdisk --zap-all -- /dev/{{item.device_name}}
with_items: "{{ devices }}"
- name: Make osd partitions
ansible.builtin.shell: >
sgdisk --new={{item.1.index}}:0:+{{item.1.size}} "--change-name={{item.1.index}}:ceph {{item.1.type}}"
"--typecode={{item.1.index}}:{% if item.1.type=='data' %}{{data_typecode}}{% else %}{{journal_typecode}}{% endif %}"
--mbrtogpt -- /dev/{{item.0.device_name}}
with_subelements:
- "{{ devices }}"
- partitions
- set_fact:
owner: 167
group: 167
when: ansible_facts['os_family'] == "RedHat"
- set_fact:
owner: 64045
group: 64045
when: ansible_facts['os_family'] == "Debian"
- name: Change partitions ownership
ansible.builtin.file:
path: "/dev/{{item.0.device_name}}{{item.1.index}}"
owner: "{{ owner | default('root')}}"
group: "{{ group | default('disk')}}"
with_subelements:
- "{{ devices }}"
- partitions
when:
item.0.device_name | match('/dev/([hsv]d[a-z]{1,2}){1,2}$')
- name: Change partitions ownership
ansible.builtin.file:
path: "/dev/{{item.0.device_name}}p{{item.1.index}}"
owner: "{{ owner | default('root')}}"
group: "{{ group | default('disk')}}"
with_subelements:
- "{{ devices }}"
- partitions
when: item.0.device_name | match('/dev/(cciss/c[0-9]d[0-9]|nvme[0-9]n[0-9]){1,2}$')
...

View File

@ -0,0 +1,105 @@
---
# This playbook is used to migrate active osd journal(s) to SSD.
#
# You should define the `osds_journal_devices` variable for the host whose osd journal(s) will be migrated.
#
# For example in host_vars/hostname1.yml
#
# osds_journal_devices:
# - device_name: /dev/sdd
# partitions:
# - index: 1
# size: 10G
# osd_id: 0
# - index: 2
# size: 10G
# osd_id: 1
# - device_name: /dev/sdf
# partitions:
# - index: 1
# size: 10G
# osd_id: 2
#
# @param device_name: The full device path of new ssd.
# @param partitions: The custom partition layout of ssd.
# @param index: The index of this partition.
# @param size: The size of this partition.
# @param osd_id: Which osd's journal this partition is for.
#
# ansible-playbook migrate-journal-to-ssd.yml
# The playbook will migrate the osd journal(s) to the ssd device you define in host_vars.
- vars:
osd_group_name: osds
journal_typecode: 45b0969e-9b03-4f30-b4c6-b4b80ceff106
osds_journal_devices: []
hosts: "{{ osd_group_name }}"
serial: 1
tasks:
- name: Get osd(s) journal_uuid file stat
ansible.builtin.stat:
path: "/var/lib/ceph/osd/{{ cluster }}-{{ item.1.osd_id }}/journal_uuid"
register: osds_dir_stat
with_subelements:
- "{{ osds_journal_devices }}"
- partitions
- name: Exit playbook if osd(s) are not on this host
ansible.builtin.fail:
msg: exiting playbook, osd(s) are not on this host
with_items:
osds_dir_stat.results
when: osds_dir_stat is defined and item.stat.exists == false
- name: Install sgdisk(gdisk)
ansible.builtin.package:
name: gdisk
state: present
register: result
until: result is succeeded
when: osds_journal_devices is defined
- name: Generate uuid for osds journal
ansible.builtin.command: uuidgen
register: osds
with_subelements:
- "{{ osds_journal_devices }}"
- partitions
- name: Make osd partitions on ssd
ansible.builtin.shell: >
sgdisk --new={{item.item[1].index}}:0:+{{item.item[1].size}} "--change-name={{ item.item[1].index }}:ceph journal"
--typecode={{ item.item[1].index }}:{{ journal_typecode }}
--partition-guid={{ item.item[1].index }}:{{ item.stdout }}
--mbrtogpt -- {{ item.item[0].device_name }}
with_items: "{{ osds.results }}"
- name: Stop osd(s) service
ansible.builtin.service:
name: "ceph-osd@{{ item.item[1].osd_id }}"
state: stopped
with_items: "{{ osds.results }}"
- name: Flush osd(s) journal
ansible.builtin.command: ceph-osd -i {{ item.item[1].osd_id }} --flush-journal --cluster {{ cluster }}
with_items: "{{ osds.results }}"
when: osds_journal_devices is defined
- name: Update osd(s) journal soft link
ansible.builtin.command: ln -sf /dev/disk/by-partuuid/{{ item.stdout }} /var/lib/ceph/osd/{{ cluster }}-{{ item.item[1].osd_id }}/journal
with_items: "{{ osds.results }}"
- name: Update osd(s) journal uuid
ansible.builtin.shell: echo {{ item.stdout }} > /var/lib/ceph/osd/{{ cluster }}-{{ item.item[1].osd_id }}/journal_uuid
with_items: "{{ osds.results }}"
- name: Initialize osd(s) new journal
ansible.builtin.command: ceph-osd -i {{ item.item[1].osd_id }} --mkjournal --cluster {{ cluster }}
with_items: "{{ osds.results }}"
- name: Start osd(s) service
ansible.builtin.service:
name: "ceph-osd@{{ item.item[1].osd_id }}"
state: started
with_items: "{{ osds.results }}"

View File

@ -0,0 +1,11 @@
---
# Nukes a multisite config
- hosts: rgws
become: true
tasks:
- include_tasks: roles/ceph-rgw/tasks/multisite/destroy.yml
handlers:
# Ansible 2.1.0 bug will ignore included handlers without this
- name: Import_tasks roles/ceph-rgw/handlers/main.yml
import_tasks: roles/ceph-rgw/handlers/main.yml

View File

@ -0,0 +1,115 @@
---
# This playbook is used to recover Ceph OSDs after an ssd journal failure.
# You will also realise that it's really simple to bring your
# OSDs back to life after replacing your faulty SSD with a new one.
#
# You should define the `dev_ssds` variable for the host whose ssds were
# replaced after the failure.
#
# For example in host_vars/hostname1.yml
#
# dev_ssds:
# - device_name: /dev/sdd
# partitions:
# - index: 1
# size: 10G
# osd_id: 0
# - index: 2
# size: 10G
# osd_id: 1
# - device_name: /dev/sdf
# partitions:
# - index: 1
# size: 10G
# osd_id: 2
#
# @param device_name: The full device path of new ssd
# @param partitions: The custom partition layout of new ssd
# @param index: The index of this partition
# @param size: The size of this partition
# @param osd_id: Which osd's journal this partition is for.
#
# ansible-playbook recover-osds-after-ssd-journal-failure.yml
# Prompts you to select which host to recover; defaults to null, meaning
# no host is selected. Input the hostname of the host on which to recover
# the osds after the ssd journal failure.
#
# ansible-playbook -e target_host=hostname \
# recover-osds-after-ssd-journal-failure.yml
# Overrides the prompt using -e option. Can be used in
# automation scripts to avoid interactive prompt.
- hosts: localhost
gather_facts: false
vars_prompt:
- name: target_host # noqa: name[casing]
prompt: please enter the target hostname which to recover osds after ssd journal failure
private: false
tasks:
- add_host:
name: "{{ target_host }}"
groups: dynamically_created_hosts
- hosts: dynamically_created_hosts
vars:
journal_typecode: 45b0969e-9b03-4f30-b4c6-b4b80ceff106
dev_ssds: []
tasks:
- fail: msg="please define dev_ssds variable"
when: dev_ssds|length <= 0
- name: Get osd(s) journal_uuid file stat
ansible.builtin.stat:
path: "/var/lib/ceph/osd/{{ cluster }}-{{ item.1.osd_id }}/journal_uuid"
register: osds_dir_stat
with_subelements:
- "{{ dev_ssds }}"
- partitions
- name: Exit playbook if osd(s) are not on this host
ansible.builtin.fail:
msg: exiting playbook, osd(s) are not on this host
with_items:
osds_dir_stat.results
when:
- osds_dir_stat is defined | bool
- item.stat.exists == false
- name: Install sgdisk(gdisk)
ansible.builtin.package:
name: gdisk
state: present
register: result
until: result is succeeded
- name: Get osd(s) journal uuid
ansible.builtin.command: cat "/var/lib/ceph/osd/{{ cluster }}-{{ item.1.osd_id }}/journal_uuid"
register: osds_uuid
with_subelements:
- "{{ dev_ssds }}"
- partitions
- name: Make partitions on new ssd
ansible.builtin.shell: >
sgdisk --new={{item.item[1].index}}:0:+{{item.item[1].size}} "--change-name={{ item.item[1].index }}:ceph journal"
--typecode={{ item.item[1].index }}:{{ journal_typecode }}
--partition-guid={{ item.item[1].index }}:{{ item.stdout }}
--mbrtogpt -- {{ item.item[0].device_name }}
with_items: "{{ osds_uuid.results }}"
- name: Stop osd(s) service
ansible.builtin.service:
name: "ceph-osd@{{ item.item[1].osd_id }}"
state: stopped
with_items: "{{ osds_uuid.results }}"
- name: Reinitialize osd(s) journal in new ssd
ansible.builtin.command: ceph-osd -i {{ item.item[1].osd_id }} --mkjournal --cluster {{ cluster }}
with_items: "{{ osds_uuid.results }}"
- name: Start osd(s) service
ansible.builtin.service:
name: "ceph-osd@{{ item.item[1].osd_id }}"
state: started
with_items: "{{ osds_uuid.results }}"

View File

@ -0,0 +1,190 @@
---
# This playbook replaces Ceph OSDs.
# It can replace any number of OSD(s) from the cluster and ALL THEIR DATA
#
# When disks fail, or if an administrator wants to reprovision OSDs with a new backend,
# for instance, for switching from FileStore to BlueStore, OSDs need to be replaced.
# Unlike removing the OSD, the replaced OSD's id and CRUSH map entry need to be kept intact after the OSD is destroyed for replacement.
#
# Use it like this:
# ansible-playbook replace-osd.yml -e osd_to_replace=0,2,6
# Prompts for confirmation to replace, defaults to no and
# doesn't replace the osd(s). yes replaces the osd(s).
#
# ansible-playbook -e ireallymeanit=yes|no replace-osd.yml
# Overrides the prompt using -e option. Can be used in
# automation scripts to avoid interactive prompt.
- name: Gather facts and check the init system
hosts:
- "{{ mon_group_name|default('mons') }}"
- "{{ osd_group_name|default('osds') }}"
become: true
tasks:
- ansible.builtin.debug: msg="gather facts on all Ceph hosts for following reference"
- name: Confirm whether user really meant to replace osd(s)
hosts: localhost
become: true
vars_prompt:
- name: ireallymeanit # noqa: name[casing]
prompt: Are you sure you want to replace the osd(s)?
default: 'no'
private: false
vars:
mon_group_name: mons
osd_group_name: osds
pre_tasks:
- name: Exit playbook, if user did not mean to replace the osd(s)
ansible.builtin.fail:
msg: "Exiting replace-osd playbook, no osd(s) was/were replaced..
To replace the osd(s), either say 'yes' on the prompt or
or use `-e ireallymeanit=yes` on the command line when
invoking the playbook"
when: ireallymeanit != 'yes'
- name: Exit playbook, if no osd(s) was/were given
ansible.builtin.fail:
msg: "osd_to_replace must be declared
Exiting replace-osd playbook, no OSD(s) was/were replaced.
On the command line when invoking the playbook, you can use
-e osd_to_replace=0,1,2,3 argument."
when: osd_to_replace is not defined
tasks:
- ansible.builtin.import_role:
name: ceph-defaults
post_tasks:
- name: Set_fact container_exec_cmd build docker exec command (containerized)
ansible.builtin.set_fact:
container_exec_cmd: "docker exec ceph-mon-{{ hostvars[groups[mon_group_name][0]]['ansible_facts']['hostname'] }}"
when: containerized_deployment | bool
- name: Exit playbook, if we cannot connect to the cluster
ansible.builtin.command: "{{ container_exec_cmd | default('') }} timeout 5 ceph --cluster {{ cluster }} health"
register: ceph_health
until: ceph_health.stdout.find("HEALTH") > -1
delegate_to: "{{ groups[mon_group_name][0] }}"
retries: 5
delay: 2
- name: Find the host(s) where the osd(s) is/are running on
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd find {{ item }}"
with_items: "{{ osd_to_replace.split(',') }}"
delegate_to: "{{ groups[mon_group_name][0] }}"
register: find_osd_hosts
- name: Set_fact osd_hosts
ansible.builtin.set_fact:
osd_hosts: "{{ osd_hosts | default([]) + [ (item.stdout | from_json).crush_location.host ] }}"
with_items: "{{ find_osd_hosts.results }}"
- name: Check if ceph admin key exists on the osd nodes
ansible.builtin.stat:
path: "/etc/ceph/{{ cluster }}.client.admin.keyring"
register: ceph_admin_key
with_items: "{{ osd_hosts }}"
delegate_to: "{{ item }}"
failed_when: false
when: not containerized_deployment | bool
- name: Fail when admin key is not present
ansible.builtin.fail:
msg: "The Ceph admin key is not present on the OSD node, please add it and remove it after the playbook is done."
with_items: "{{ ceph_admin_key.results }}"
when:
- not containerized_deployment | bool
- item.stat.exists == false
# NOTE(leseb): using '>' is the only way I could have the command working
- name: Find osd device based on the id
ansible.builtin.shell: >
docker run --privileged=true -v /dev:/dev --entrypoint /usr/sbin/ceph-disk
{{ ceph_docker_registry}}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
list | awk -v pattern=osd.{{ item.1 }} '$0 ~ pattern {print $1}'
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace.split(',') }}"
register: osd_to_replace_disks
delegate_to: "{{ item.0 }}"
when: containerized_deployment | bool
- name: Zapping osd(s) - container
ansible.builtin.shell: >
docker run --privileged=true -v /dev:/dev --entrypoint /usr/sbin/ceph-disk
{{ ceph_docker_registry}}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
zap {{ item.1 }}
run_once: true
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
delegate_to: "{{ item.0 }}"
when: containerized_deployment | bool
- name: Zapping osd(s) - non container
ansible.builtin.command: ceph-disk zap --cluster {{ cluster }} {{ item.1 }}
run_once: true
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
delegate_to: "{{ item.0 }}"
when: not containerized_deployment | bool
- name: Destroying osd(s)
ansible.builtin.command: ceph-disk destroy --cluster {{ cluster }} --destroy-by-id {{ item.1 }} --zap
run_once: true
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace.split(',') }}"
delegate_to: "{{ item.0 }}"
when: not containerized_deployment | bool
- name: Replace osd(s) - prepare - non container
ansible.builtin.command: ceph-disk prepare {{ item.1 }} --osd-id {{ item.2 }} --osd-uuid $(uuidgen)
run_once: true
delegate_to: "{{ item.0 }}"
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
- "{{ osd_to_replace.split(',') }}"
- name: Replace osd(s) - prepare - container
ansible.builtin.shell: >
docker run --privileged=true -v /dev:/dev --entrypoint /usr/sbin/ceph-disk
{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
prepare {{ item.1.stdout }}
run_once: true
delegate_to: "{{ item.0 }}"
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
- name: Replace osd(s) - activate - non container
ansible.builtin.command: ceph-disk activate {{ item.1.stdout }}1
run_once: true
delegate_to: "{{ item.0 }}"
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
- name: Replace osd(s) - activate - container
ansible.builtin.shell: >
docker run --privileged=true -v /dev:/dev --entrypoint /usr/sbin/ceph-disk
{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
activate {{ item.1.stdout }}1
run_once: true
delegate_to: "{{ item.0 }}"
with_together:
- "{{ osd_hosts }}"
- "{{ osd_to_replace_disks.results }}"
- name: Show ceph health
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
delegate_to: "{{ groups[mon_group_name][0] }}"
- name: Show ceph osd tree
ansible.builtin.command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd tree"
delegate_to: "{{ groups[mon_group_name][0] }}"

View File

@ -0,0 +1,57 @@
# This file configures logical volume creation for FS journals on NVMe, an NVMe-based bucket index, and HDD-based OSDs.
# This playbook configures one NVMe device at a time. If your OSD systems contain multiple NVMe devices, you will need to edit the key variables ("nvme_device", "hdd_devices") for each run.
# It is meant to be used when osd_objectstore=filestore and it outputs the necessary input for group_vars/osds.yml.
# The LVs for journals are created first, then the LVs for data. Each journal LV corresponds to a data LV.
#
## CHANGE THESE VARS ##
#
# The NVMe device and the HDD devices must be raw and must not carry any GPT, FS, or RAID signatures.
# GPT, FS, & RAID signatures should be removed from a device prior to running the lv-create.yml playbook;
# leftover signatures can result in Ansible errors such as "device $device_name excluded by a filter" when lv-create.yml runs.
# This can be done by running `wipefs -a $device_name`.
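# For example (device paths are illustrative): wipefs -a /dev/nvme0n1 && wipefs -a /dev/sdd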
# Path of nvme device primed for LV creation for journals and data. Only one NVMe device is allowed at a time. Providing a list will not work in this case.
nvme_device: dummy
# Path of hdd devices designated for LV creation.
hdd_devices:
- /dev/sdd
- /dev/sde
- /dev/sdf
- /dev/sdg
- /dev/sdh
# Per the lvol module documentation, "size" and "journal_size" are the sizes of the logical volumes, per lvcreate(8) --size.
# By default the value is in megabytes, or optionally takes one of the [bBsSkKmMgGtTpPeE] units; alternatively, per lvcreate(8) --extents, it can be a percentage of [VG|PVS|FREE]. Float values must begin with a digit.
# For further reading and examples see: https://docs.ansible.com/ansible/2.6/modules/lvol_module.html
# Suggested journal size is 5500
journal_size: 5500
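# Examples of accepted values (per lvcreate(8)): "5500" (5500 megabytes), "20g" (20 gigabytes),
# "100%FREE" (all remaining free space in the volume group).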
# This var is a list of bucket index LVs created on the NVMe device. We recommend creating one, but you can add others.
nvme_device_lvs:
- lv_name: "ceph-bucket-index-1"
size: 100%FREE
journal_name: "ceph-journal-bucket-index-1-{{ nvme_device_basename }}"
## TYPICAL USERS WILL NOT NEED TO CHANGE VARS FROM HERE DOWN ##
# the path to where to save the logfile for lv-create.yml
logfile_path: ./lv-create.log
# All HDDs must be the same size; the LVs created on them are dedicated to OSD data.
hdd_lv_size: 100%FREE
# Since this playbook can be run multiple times across different devices, {{ var.split('/')[-1] }} is used frequently throughout.
# It strips the device name from its path (e.g. sdc from /dev/sdc) to differentiate the names of VGs, journals, and LVs when the prefixes are not changed across multiple runs.
nvme_device_basename: "{{ nvme_device.split('/')[-1] }}"
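# For example, with nvme_device: /dev/nvme0n1 (illustrative), nvme_device_basename becomes "nvme0n1".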
# Only one volume group is created in the playbook for all the LVs on NVMe. This volume group takes up the entire device specified in "nvme_device".
nvme_vg_name: "ceph-nvme-vg-{{ nvme_device_basename }}"
hdd_vg_prefix: "ceph-hdd-vg"
hdd_lv_prefix: "ceph-hdd-lv"
hdd_journal_prefix: "ceph-journal"
# Journals are created on the NVMe device

0
library/__init__.py Normal file
View File
