
TripleO Quickstart deployments on baremetal using TOAD

This article is going to cover how to deploy TripleO Quickstart on baremetal. The undercloud will still be virtualized, but controller and compute will be deployed on baremetal.
This post belongs to a series. To learn more about TOAD and tripleo-quickstart, please read http://teknoarticles.blogspot.com/2017/02/automated-osp-deployments-with-tripleo.html and http://teknoarticles.blogspot.com/2017/02/describing-cira-continuous-integration.html

Requirements

Hardware

  • A baremetal server is needed to act as the Jenkins slave and to host the virtualized undercloud. A multi-core CPU, 16GB of RAM and 60GB of disk are the recommended setup.
  • One server for each controller/compute that needs to be deployed. They need to have at least 8GB of RAM.

Network

  • IPMI access is needed for each controller/compute server
  • A provisioning network is required. The Jenkins slave and the controller/compute nodes need to be on the same network, and the defined CIDR cannot be used for any other deployments. The network also cannot contain a gateway, because the undercloud will act as the gateway.
  • The MAC address of each baremetal server needs to be identified so it can be passed in the instackenv.json file (see the sketch after this list for one way to collect and verify this information)
  • An external NIC is needed on the Jenkins slave, so it can be reached externally without interfering with the provisioning actions
  • If you choose to deploy with network isolation (recommended), you will need independent NICs for admin/storage/external, etc., or tagged VLANs for all of them: internal, storage, storage management, tenant and external. Make sure the external tagged VLAN has the right connectivity.
  • The provisioning network needs to be on a dedicated NIC; tagged VLANs cannot be used for it, due to the limitations of PXE boot.
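
Before wiring everything into TOAD, it can be useful to confirm that the IPMI credentials work and that the collected MAC addresses end up in a valid instackenv.json. Below is a minimal, hypothetical Ansible sketch (not part of TOAD) that loops over the nodes of an instackenv.json and queries their power status with ipmitool; it assumes ipmitool is installed on the machine running it and that the file sits in the current directory:

- hosts: localhost
  gather_facts: false
  vars:
    # load the nodes list from the instackenv.json in the current directory
    instackenv: "{{ lookup('file', 'instackenv.json') | from_json }}"
  tasks:
    - name: Check IPMI power status for every node defined in instackenv.json
      command: >
        ipmitool -I lanplus -H {{ item.pm_addr }}
        -U {{ item.pm_user }} -P {{ item.pm_password }}
        power status
      with_items: "{{ instackenv.nodes }}"
      changed_when: false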

How to define lab environments

The TOAD project provides jobs already created for baremetal deployment. When using https://github.com/redhat-nfvpe/jenkins-jobs, these job definitions will be expanded into Jenkins and will generate baremetal deployments for Newton and OSP:


See the job naming for oooq-newton-deploy-baremetal-toad-toad: it follows the schema oooq-<<release>>-deploy-baremetal-<<slave_label>>-<<environment>>. The first "toad" refers to the Jenkins slave label where the job will be run, and the second "toad" refers to the lab environment config that will be applied.

By default, the jobs are created pointing to https://github.com/redhat-nfvpe/toad_envs , a repository that contains a sample environment illustrating how to create your own.
You will need to create your own repository for environments as a prerequisite for baremetal deployments with TOAD. When installing TOAD for the first time, you need to define the following ansible vars:

jenkins_job_baremetal_env_git_src: https://github.com/redhat-nfvpe/toad_envs.git
jenkins_job_baremetal_env_path: ''


The first setting defines the git repo where you will create your environments. The second one can be used to point to a relative path inside your repo (useful if you are reusing an existing project; for an independent project, just leave it blank).
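
As a purely hypothetical example (the repository name and path below are invented), if your environments live in a subdirectory of an existing repository, the vars could look like:

jenkins_job_baremetal_env_git_src: https://github.com/example/my-lab-configs.git
jenkins_job_baremetal_env_path: 'tripleo/envs'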

What an environment looks like

In TOAD you can define multiple environments, in case you want to execute baremetal deployments under several lab configurations. By default only the "toad" environment will be defined, but you can create more job combinations to add extra environments, as shown in the sketch below.
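
For illustration only, a jenkins-job-builder project stanza along the following lines could expand the oooq-<<release>>-deploy-baremetal-<<slave_label>>-<<environment>> naming schema for extra environments. The actual template names and structure in redhat-nfvpe/jenkins-jobs may differ, so treat this as a sketch:

- project:
    name: oooq-baremetal-deployments
    release:
      - newton
      - osp10
    slave_label: toad
    environment:
      - toad
      - mylab        # hypothetical second lab environment
    jobs:
      - 'oooq-{release}-deploy-baremetal-{slave_label}-{environment}'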

The toad_envs repository needs to follow this schema:
|
|---- <<name of the environment>>
   |
   |---- deploy_config.yml
   |---- env_settings.yml
   |---- instackenv.json
   |---- net_environment.yml

So for each environment you need to define these files. Let's explain their usage and what they look like.

instackenv.json 

This file contains the credentials to access the servers and defines the provisioning MAC address for each of them. It has the following format:
{
  "nodes": [
    {
      "mac": [ "<<provisioning_mac_address_here>>" ],
      "cpu": "<<number_of_cores>>",
      "memory": "<<memory_in_mb>>",
      "disk": "<<disk_in_gb>>",
      "arch": "x86_64",
      "pm_type": "pxe_ipmitool",
      "pm_user": "<<ipmi_user>>",
      "pm_password": "<<ipmi_password>>",
      "pm_addr": "<<ipmi_address>>"
    }
  ]
}


You need to add one entry for each baremetal server you want to manage.
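
For instance, a two-node file could look like the following; all addresses, credentials and hardware figures here are invented placeholders, so replace them with the real values for your servers:

{
  "nodes": [
    {
      "mac": [ "52:54:00:aa:bb:01" ],
      "cpu": "8",
      "memory": "32768",
      "disk": "200",
      "arch": "x86_64",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_password": "secret",
      "pm_addr": "10.0.0.101"
    },
    {
      "mac": [ "52:54:00:aa:bb:02" ],
      "cpu": "8",
      "memory": "32768",
      "disk": "200",
      "arch": "x86_64",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_password": "secret",
      "pm_addr": "10.0.0.102"
    }
  ]
}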

deploy_config.yml

This file just contains the extra parameters that need to be passed to the TripleO Heat templates, depending on the type of deployment you need to perform. This is the one defined by default:

extra_args: " --control-flavor baremetal --compute-flavor baremetal -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e ~/network-environment.yaml --ntp-server <<your_ntp_server_here>>"

Here you just need to replace the NTP server with the one in your system. For HA deployments, Ceph, etc., you just need to define the right settings and include the proper templates, following the TripleO documentation (see the sketch below).
In our case we defined the deployment to work with network isolation, using single-nic-vlans. You will need to include the appropriate templates if you want to go without isolation, or with independent NICs instead of VLANs.
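
As a rough sketch only, a deploy_config.yml for a 3-controller HA overcloud with Ceph storage might look like the following; the scale flags and template paths can vary between TripleO releases, so double-check them against the documentation for the release you deploy:

extra_args: " --control-flavor baremetal --compute-flavor baremetal --ceph-storage-flavor baremetal --control-scale 3 --compute-scale 2 --ceph-storage-scale 3 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e ~/network-environment.yaml --ntp-server <<your_ntp_server_here>>"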

env_settings.yml

This file contains the Ansible vars that will be passed to TripleO Quickstart and that are required to perform the baremetal deployment. The default file looks like this:

environment_type: toad <- replace with your own environment
hw_env: toad <- same
undercloud_network_cidr: 172.31.255.0/24 <- CIDR for the provisioning network
undercloud_local_ip: 172.31.255.1/24 <- IP that the undercloud will take on that network
undercloud_network_gateway: 172.31.255.1 <- same as the undercloud IP, without mask
undercloud_undercloud_public_vip: 172.31.255.2 <- for HA deploys
undercloud_undercloud_admin_vip: 172.31.255.3 <- for HA deploys
undercloud_local_interface: eth1 <- default, as the undercloud will be a VM and eth0 is the libvirt bridge
undercloud_masquerade_network: 172.31.255.0/24 <- same as provisioning
undercloud_dhcp_start: 172.31.255.105 <- first IP address that baremetal servers will get, set to an unused range
undercloud_dhcp_end: 172.31.255.124 <- set to an unused range
undercloud_inspection_iprange: 172.31.255.200,172.31.255.220 <- temporary range for inspection, set to an unused range
virthost_provisioning_interface: <<nic_for_provision_on_the_virthost>> <- set this to the physical interface on your virthost where the provisioning network will be created
virthost_provisioning_ip: 172.31.255.125 <- IP in the provisioning network that will be created on the virthost
virthost_provisioning_netmask: 255.255.255.0
virthost_provisioning_hwaddr: <<mac_address_for_provision_on_the_virthost>> <- MAC address of the NIC that will be used in the provisioning network
virthost_ext_provision_interface: <<external_nic_on_the_virthost>> <- NIC for the external network of the virthost; the network on that NIC needs to be configured manually
undercloud_instackenv_template: "{{ jenkins_workspace }}/<<name_of_your_envs_repo>>/{{ hw_env }}/instackenv.json" <- path to the instackenv.json for your deploy, replace the repo name properly
overcloud_nodes: <- just empty, to take the defaults
undercloud_type: virtual <- could be physical or ovb as well
step_introspect: true <- whether to execute the introspection steps on the baremetal servers. Recommended: true
introspect: true <- same
network_environment_file: "{{ jenkins_workspace }}/<<name_of_your_repo>>/{{ hw_env }}/net_environment.yml" <- same as undercloud_instackenv_template, but for defining the network environment
network_isolation_type: single_nic_vlans <- can take other TripleO values, but needs to match the one specified in deploy_config.yml
network_isolation: true <- same, needs to match deploy_config.yml
enable_vbmc: false <- not used in our deployments
external_network_cidr: 192.168.23.0/24 <- set this to the range of your external network, can be anything
networks: <- defines the networks that will be created on the virthost
  - name: external
    bridge: brext
    forward_mode: nat
    address: "{{ external_network_cidr|nthhost(1) }}"
    netmask: "{{ external_network_cidr|ipaddr('netmask') }}"
    dhcp_range:
      - "{{ external_network_cidr|nthhost(10) }}"
      - "{{ external_network_cidr|nthhost(50) }}"
    nat_port_range:
      - 1024
      - 65535
  - name: overcloud
    bridge: brovc
    phys_int: <<nic_for_provision_on_the_virthost>>
undercloud_networks: {} <- defines the networks that will be created on the undercloud; nothing is needed because it inherits the networks from the virthost
network_isolation_ipv4_cidr: <<external_ip_cidr>> <- set to the external range
undercloud_external_network_cidr: <<external_ip_cidr>> <- same
floating_ip_cidr: <<floating_ip_cidr>> <- set if you want to use floating IPs, to an unused range on the external network
floating_ip_start: <<floating_ip_start>> <- start of an unused range on the external network
floating_ip_end: <<floating_ip_end>>
external_network_gateway: <<external_network_gateway>> <- set to the real IP address of your router

net_environment.yml

This is just the environment definition for TripleO. The settings here need to match the ones defined in env_settings.yml. A default file, used for network isolation with VLANs, looks like this:

parameter_defaults:
  InternalApiNetCidr: 172.17.1.0/24 <- range for the internal network, can be anything
  StorageNetCidr: 172.17.3.0/24 <- range for storage
  StorageMgmtNetCidr: 172.17.4.0/24 <- range for storage management
  TenantNetCidr: 172.17.2.0/24 <- range for the tenant network
  ExternalNetCidr: <<external_ip_cidr>> <- real CIDR of your external network, needs to match the one defined in env_settings.yml
  InternalApiAllocationPools: [{'start': '172.17.1.10', 'end': '172.17.1.200'}] <- range of IPs for the internal API, needs to match the CIDR
  StorageAllocationPools: [{'start': '172.17.3.10', 'end': '172.17.3.200'}] <- range of IPs for the storage network
  StorageMgmtAllocationPools: [{'start': '172.17.4.10', 'end': '172.17.4.200'}] <- range of IPs for the storage management network
  TenantAllocationPools: [{'start': '172.17.2.10', 'end': '172.17.2.200'}] <- range of IPs for the tenant network
  ExternalAllocationPools: [{'start': '<<external_ip_range_start>>', 'end': '<<external_ip_range_end>>'}] <- range of IPs on the external network
  ExternalInterfaceDefaultRoute: <<external_ip_gateway>> <- IP address of the gateway used for external network communication
  InternalApiNetworkVlanID: 2001 <- just a sample, set to the right VLAN tag, needs to be configured on your switch
  StorageNetworkVlanID: 2002
  StorageMgmtNetworkVlanID: 2003
  ExternalNetworkVlanID: <<external_vlan_tag>> <- set to the VLAN tag configured in your switch specifically for external access
  TenantNetworkVlanID: 2004
  NeutronExternalNetworkBridge: "" <- not needed in our sample, leave it blank
  ControlPlaneSubnetCidr: "24"
  ControlPlaneDefaultRoute: 172.31.255.1 <- set to the undercloud IP
  EC2MetadataIp: 172.31.255.1 <- set to the undercloud IP
  DnsServers: ["<<dns_server_1>>", "<<dns_server_2>>"] <- set to the DNS servers for your environment

Once the four files for the environment are configured properly, matching the configuration in your switches, baremetal deployment will be possible.

Special considerations for baremetal deployment

  • A baremetal deployment with quickstart works in the same way as the virtualized one, but relies on the physical servers and network, so it is crucial that the information in instackenv.json and the network config match reality.
  • The virthost where the undercloud lives also needs to have the same network configuration as the baremetal servers for controller and compute, so as an extra step the baremetal playbook will configure the network settings properly on the virthost.
  • To perform these extra steps, a different tripleo-quickstart playbook is needed (baremetal-full-deploy.yml), which will perform these tasks and then call the default deployment playbook:
bash ./quickstart.sh \
--working-dir /home/stack/quickstart \
--no-clone \
--bootstrap \
--teardown all \
--tags all \
--skip-tags overcloud-validate,teardown-provision \
--requirements requirements.txt \
--requirements quickstart-extras-requirements.txt \
--config ./config/general_config/minimal.yml \
--release osp10 \
-e jenkins_workspace=${WORKSPACE} \
--config ${WORKSPACE}/toad_envs/yolanda/deploy_config.yml \
--extra-vars @${WORKSPACE}/toad_envs/yolanda/env_settings.yml \
--playbook baremetal-full-deploy.yml \
$VIRTHOST 2>&1 | tee ${WORKSPACE}/job_${BUILD_ID}.log


  • Regarding the network configuration that is created after a TripleO Quickstart deployment on baremetal: TripleO Quickstart will take care of all the network creation automatically. The only network that needs to be created manually is the external network on the virthost.
  • When doing baremetal deployments on servers with multiple disks, it is important that the right disk is picked. To achieve that, you can proceed in two ways: giving hints globally by size, or giving hints for each node individually. To give hints by size you need to add these settings to your env_settings.yml:
step_root_device_size: true
disk_root_device_size: <<size_in_gb_of_your_hard_disk>>


To give hints for individual nodes, using any hint that is available in Ironic, you need to add these settings:

step_root_device_hints: true
root_device_hints:
    - ip: <<pm_addr>>
      key: <<string>>
      value: <<string>>

Here, pm_addr needs to match the IPMI address of the node defined in instackenv.json that you want to provide device hints for; key needs to match any hint defined for Ironic; and value is the desired value for it.
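
For example (purely illustrative values, not taken from the TOAD defaults), pinning one node to the disk with a known serial number could look like this, since serial is one of the root device hints Ironic supports:

step_root_device_hints: true
root_device_hints:
    - ip: 10.0.0.101          # IPMI address of the node, as in instackenv.json
      key: serial             # any Ironic root device hint (size, model, wwn, serial, ...)
      value: "S3Z1NB0K123456" # the serial number of the disk to install on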

Conclusion

A baremetal deployment with TripleO Quickstart is easy in TOAD; the only extra requirement is creating the environments properly. It is as simple as following these steps:
  • Create the toad_envs repository on some git repo you own
  • Add as many entries as needed, to define all your environments. Fill the instackenv.json, deploy_config.yml, env_settings.yml and net_environment.yml with the right information for each environment
  • Before deploying TOAD, set the right config vars to point to your environment repository: jenkins_job_baremetal_env_git_src and jenkins_job_baremetal_env_path
  • TOAD will come with one baremetal job for the TOAD environment. If you want to name it differently or add more environments, update the jenkins-jobs definitions accordingly.
  • Be sure that the network configuration on your switch is correct: IPMI access, a single NIC for provisioning, an external network on the virthost with a tagged VLAN, and internal/tenant/storage/storage-mgmt tagged VLANs created
  • Execute the baremetal job to deploy on your servers, monitor them to check that they PXE boot properly, and monitor the deployment and validation of TripleO
  • If needed, provide root device hints to boot with the proper disk
  • Enjoy your deployment!
