AMI Build Guide#

This chapter explains in detail how a developer can use this library to manage a complex Packer AMI build project.

The `workflow` Folder#

Folder Structure

The root of this repo contains a workflow directory, which holds a workflow_param.json file and a set of folders. The layout looks roughly like this:

/workflow/
/workflow/{step1_workspace}/
/workflow/{step2_workspace}/
/workflow/.../
/workflow/find_root_base_image_id.py
/workflow/workflow_param.json

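If you recall the strategy described in Workflow and Step Strategy, where building one AMI is split into multiple steps, the whole workflow directory is a single workflow, and each subdirectory in it is a Step.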

Step1

step1 is a special Step. It is the first step in this workflow, and it also shows the typical code layout of a step directory. Every step you actually use is created from step1 as a template.

Find root base image id script

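A workflow usually starts from a base image, which is called the root base image for all steps in the workflow. find_root_base_image_id.py is a script for selecting that base image. In this example, we pick the latest build of a given Ubuntu release as the root base image. Once you have the image id and image name, you can fill them into the workflow_param.json file (see the next section for details).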

find_root_base_image_id.py

# -*- coding: utf-8 -*-

"""
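This script helps you find a suitable AWS AMI published by Ubuntu to use as the
base image. Once you have found it, put its id and name into the ``workflow_param.json`` file.

See the official Ubuntu document
`Find Ubuntu images on AWS <https://documentation.ubuntu.com/aws/en/latest/aws-how-to/instances/find-ubuntu-images/>`_
for the underlying method. This script only automates it.

Below is some information I copied from that document on 2024-06-13, kept here for my own reference: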

The format for the parameter is:

    ubuntu/$PRODUCT/$RELEASE/stable/current/$ARCH/$VIRT_TYPE/$VOL_TYPE/ami-id

- PRODUCT: server, server-minimal or pro-server
- RELEASE: jammy, 22.04, focal, 20.04, bionic, 18.04, xenial, or 16.04
- ARCH: amd64 or arm64
- VIRT_TYPE: pv or hvm
- VOL_TYPE: ebs-gp3 (for >=23.10), ebs-gp2 (for <=23.04), ebs-io1, ebs-standard, or instance-store
"""

from pathlib_mate import Path
from boto_session_manager import BotoSesManager
from rich import print as rprint

import packer_ami_workflow.api as paw

# ------------------------------------------------------------------------------
# Find a suitable root base AMI based on the Ubuntu release and the arch (AMD64 or ARM).
# The owner_account_id comes from the reference document at the top of this script.
aws_profile = "bmt_app_dev_us_east_1"
root_base_ami_name = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
root_base_ami_owner_account_id = "099720109477"
# ------------------------------------------------------------------------------
bsm = BotoSesManager(profile_name=aws_profile)

# locate the latest root base ami, this is my own implementation of the
# packer's source_ami_filter feature, for better control
images = paw.find_root_base_ami(
    ec2_client=bsm.ec2_client,
    source_ami_name=root_base_ami_name,
    source_ami_owner_account_id=root_base_ami_owner_account_id,
)
latest_root_base_ami = images[0]
ami_catalog_url = (
    f"https://{bsm.aws_region}.console.aws.amazon.com/ec2"
    f"/home?region={bsm.aws_region}#AMICatalog:"
)
print("Root base AMI details:")
rprint(latest_root_base_ami)
print(f"Root base AMI id = {latest_root_base_ami.id}")
print(f"Root base AMI name = {latest_root_base_ami.name}")
print(
    f"Enter the AMI id in ami catalog url to see the details "
    f"(in Community AMIs tab): {ami_catalog_url}"
)
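The script delegates the lookup to paw.find_root_base_ami, whose implementation is not shown here. As a rough sketch of what such a lookup could involve, the snippet below calls the EC2 describe_images API with a name filter and an owner id, then sorts by CreationDate so the newest image comes first. The function name find_latest_images is hypothetical and is not part of packer_ami_workflow; ec2_client is assumed to be a boto3 EC2 client.

```python
from typing import Any, Dict, List


def find_latest_images(
    ec2_client: Any,
    name_pattern: str,
    owner_account_id: str,
) -> List[Dict[str, Any]]:
    """Return matching AMIs, newest first (hypothetical helper, not the library API)."""
    response = ec2_client.describe_images(
        Owners=[owner_account_id],
        Filters=[
            # the "name" filter accepts wildcards such as "ubuntu-jammy-*"
            {"Name": "name", "Values": [name_pattern]},
            {"Name": "state", "Values": ["available"]},
        ],
    )
    # CreationDate is an ISO 8601 string, so a plain string sort orders by time
    return sorted(
        response["Images"],
        key=lambda image: image["CreationDate"],
        reverse=True,
    )
```

Picking index 0 of the returned list then mirrors the images[0] line in the script above.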

Workflow Parameter JSON File

The packer templates of these Steps take many parameters, and many of those parameters have the same value across Steps. /workflow/workflow_param.json stores the values of these shared parameters.
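The content of workflow_param.json is not shown in this chapter, so the following is only a sketch of how such a shared parameter file could be loaded and validated. The class name WorkflowParamSketch and its three field names are assumptions made for illustration; the real WorkflowParam class in packer_ami_workflow may look different.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class WorkflowParamSketch:
    """Hypothetical container for shared parameters; field names are assumptions."""

    aws_region: str
    root_base_ami_id: str
    root_base_ami_name: str

    @classmethod
    def from_json_file(cls, path) -> "WorkflowParamSketch":
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        names = [f.name for f in fields(cls)]
        # fail early when a shared parameter is missing from the file
        missing = [name for name in names if name not in data]
        if missing:
            raise KeyError(f"missing keys in {path}: {missing}")
        # keys that are not declared as fields are ignored
        return cls(**{name: data[name] for name in names})
```

Loading the file once and passing the resulting object to every step keeps the shared values in a single place.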

Per Step Folder#

Next, let's go into a concrete Step1 directory and see how the packer template of each Step should be written. The directory structure of a Step workspace is listed below.

# Core files
/workspace/
/workspace/templates/
/workspace/templates/.pkr.hcl
/workspace/templates/.pkrvars.hcl
/workspace/templates/.variables.pkr.hcl
/workspace/.gitignore
/workspace/packer_build.py # <--- run packer build with this script
/workspace/param.json

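# This example shows how to implement provision logic with a Python shell script that needs no Python libraries
/workspace/zero_deps_script.py
# This example shows how to implement provision logic with a Python shell script that needs a few Python libraries
/workspace/some_deps_script.py
# This example shows how to implement very complex provision logic
# The core idea is to use the .sh script to install the Python dependencies
# and then use the .py script to implement the complex provision logic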
/workspace/complicated_script.py
/workspace/complicated_script.sh
/workspace/README.rst

Before going into details, let's first look at the /workspace/templates/ directory:

Prepare Packer Templates#

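A native Packer template is essentially a declarative script. Like CloudFormation, it is declarative rather than procedural. It shares the usual drawbacks of declarative scripts: limited automation and an inflexible parameter system, so you cannot use if/else or for loops driven by parameters to control the structure of the whole template. I therefore wrapped the template in another layer using the jinja2 template engine (similar to how I improved my CloudFormation workflow early on). The development flow is as follows: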

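1. Write the hcl templates in jinja2. A params Python object serves as the container for all parameters, and parameters are inserted with syntax like {{ params.parameter_name }}. All jinja2 templates live in the templates directory.
2. Create the params object in a Python script. Where the params data is stored is up to the developer; it is usually kept in JSON.
3. Render the final hcl files with jinja2 and put them in the Step directory.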

Step 1 of this flow involves three key files:

.pkr.hcl: the main packer template script; it defines the packer build logic.
.variables.pkr.hcl: the declaration file for packer variables. Note that it only declares variables and contains no values. (see Input Variables and local variables for more information)
.pkrvars.hcl: the values of the packer variables. packer build reads its data from this file.
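To make the role of the values file concrete, here is a minimal sketch that turns a flat Python dict into the `name = value` lines a .pkrvars.hcl file contains. The helper names to_hcl_literal and render_pkrvars are hypothetical, and only strings, booleans, and numbers are handled; this is not how the library itself generates the file.

```python
import json
from typing import Any, Dict


def to_hcl_literal(value: Any) -> str:
    """Convert a simple Python value into an HCL literal (hypothetical helper)."""
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # JSON string quoting is accepted by HCL for ordinary strings;
        # values containing "${" would need extra escaping, not handled here
        return json.dumps(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_pkrvars(values: Dict[str, Any]) -> str:
    """Render one 'name = value' assignment per line."""
    lines = [f"{name} = {to_hcl_literal(value)}" for name, value in values.items()]
    return "\n".join(lines) + "\n"
```

For example, render_pkrvars({"aws_region": "us-east-1"}) produces a single assignment line that packer can read through its -var-file option.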

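When writing *.pkr.hcl, every parameter that appears in the packer template as a plain value substitution (for example ami_name = var.output_ami_name) must be declared in *.variables.pkr.hcl. This makes full use of packer's declaration syntax to document what each variable is for. Please do not substitute the value directly with syntax like {{ params.output_ami_name }}, because that makes the code harder to maintain. Parameters that control the structure of the template, on the other hand, should not go into *.variables.pkr.hcl. I don't think jinja2 templates should completely replace packer's variables system: jinja2 is primarily a string template engine and does not check types when inserting values, so we use jinja2 only for string manipulation, if/else, and for loops.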
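The division of labor described above can be sketched with a small jinja2 example. The template text, the ParamsSketch class, and its two fields are made up for illustration and are not taken from the step1 template; the point is that jinja2 only decides which blocks and lines exist, while values such as var.output_ami_name are left for packer's own variable system.

```python
from dataclasses import dataclass, field
from typing import List

from jinja2 import Template


@dataclass
class ParamsSketch:
    """Hypothetical structural parameters; they never become packer variables."""

    extra_packages: List[str] = field(default_factory=list)
    use_custom_vpc: bool = False


# "{%-" strips the whitespace before a tag, so the rendered file has no blank lines
TEMPLATE_TEXT = """\
source "amazon-ebs" "example" {
  ami_name = var.output_ami_name
{%- if params.use_custom_vpc %}
  associate_public_ip_address = true
{%- endif %}
}

build {
  sources = ["source.amazon-ebs.example"]
  provisioner "shell" {
    inline = [
{%- for pkg in params.extra_packages %}
      "sudo apt-get install -y {{ pkg }}",
{%- endfor %}
    ]
  }
}
"""


def render_template(params: ParamsSketch) -> str:
    """Render the jinja2 layer; packer resolves var.* later at build time."""
    return Template(TEMPLATE_TEXT).render(params=params)
```

Rendering with ParamsSketch(extra_packages=["curl", "git"], use_custom_vpc=True) yields two install lines and the public IP setting, while the ami_name line passes through unchanged.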

The source code of these three key files in step1 is shown below:

Important

.pkr.hcl is the most important one; please read its comments carefully, especially the parts about how to run provisioning with complex Python automation scripts.

.pkr.hcl

packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = "~> 1"
    }
  }
}

source "amazon-ebs" "ubuntu20" {
  ami_name      = var.output_ami_name
  instance_type = "t2.micro"
  region        = var.aws_region
  ssh_username  = "ubuntu"

  /*----------------------------------------------------------------------------
  You can either explicitly specify the ``source_ami`` field or use the ``source_ami_filter``
  to find the AMI ID automatically. I personally prefer to provide the ``source_ami``
  explicitly for better control.
  https://developer.hashicorp.com/packer/integrations/hashicorp/amazon/latest/components/builder/ebs
  ----------------------------------------------------------------------------*/
  source_ami = var.source_ami_id

#  source_ami_filter {
#    filters = {
#      name                = var.source_ami_name
#      root-device-type    = "ebs"
#      virtualization-type = "hvm"
#    }
#    most_recent = true
#    owners      = [var.source_ami_owner_account_id]
#  }

  /*----------------------------------------------------------------------------
  If you want to build on a custom VPC, you can uncomment the following block
  ----------------------------------------------------------------------------*/
  # if you use a non-default VPC, you need to explicitly set this to true
  associate_public_ip_address = true

  # you don't have to set the VPC explicitly if you specified the subnet.
#  vpc_filter {
#    filters = {
#      "tag:Name": var.vpc_name,
#      "isDefault": var.is_default_vpc,
#    }
#  }

  # make sure you are using a public subnet
  subnet_filter {
    filters = {
      "tag:Name": var.subnet_name,
    }
    most_free = true
    random = false
  }

  # make sure the security group has ssh inbound rule
  security_group_filter {
    filters = {
      "tag:Name": var.security_group_name,
    }
  }

  /*----------------------------------------------------------------------------
  If you want to use a custom IAM role, you can use ``iam_instance_profile``
  ----------------------------------------------------------------------------*/
  iam_instance_profile = var.ec2_iam_role_name

  /*----------------------------------------------------------------------------
  If you need to add additional volume to your AMI, you can do it here
  ----------------------------------------------------------------------------*/
#  launch_block_device_mappings {
#      device_name = "/dev/sda1"
#      # in most cases, you should set delete_on_termination = true
#      # the AMI has the snapshot of the volume already. When you use the output
#      # of this build as an image, it will create an EBS volume from the snapshot.
#      # If you set delete_on_termination = false, you will end up with a volume
#      # after the build and you have to clean it up yourself
#      delete_on_termination = true
#      /*
#      gp3 would be the optimal choice for most of the cases since Dec 2020
#
#      reference:
#
#      - Introducing new Amazon EBS general purpose volumes, gp3: https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/
#      - Migrate your Amazon EBS volumes from gp2 to gp3 and save up to 20% on costs: https://aws.amazon.com/blogs/storage/migrate-your-amazon-ebs-volumes-from-gp2-to-gp3-and-save-up-to-20-on-costs/
#      */
#      volume_type = "gp3"
#      volume_size = 30
#  }
#
#  ami_block_device_mappings {
#     device_name = "/dev/sda1"
#     delete_on_termination = true
#     volume_type = "gp3"
#  }
}

build {
  name    = "install python"
  sources = [
    "source.amazon-ebs.ubuntu20"
  ]

  provisioner "shell" {
    inline = [
      "sleep 10",
      # verify ebs attachment
      "lsblk",
      "df -h",
    ]
  }


  /*----------------------------------------------------------------------------
  if you need to sudo install something, do it in the ``inline`` block
  ----------------------------------------------------------------------------*/
  provisioner "shell" {
    inline = [
      "sudo apt-get install -y curl",
      "sudo apt-get install -y wget",
      "sudo apt-get install -y git",
      "sudo apt-get install -y unzip",
    ]
  }


  /*----------------------------------------------------------------------------
  if you need to run complicated logic in Python, and it has zero dependencies
  and fits in one file, you can upload the script to the server and run it.
  make sure you have the right shebang ``#!/usr/bin/env python`` in the first line
  ----------------------------------------------------------------------------*/
  provisioner "shell" {
    script = "zero_deps_script.py"
  }


  /*----------------------------------------------------------------------------
  if your Python script has some simple dependencies, please pre-configure
  pyenv https://github.com/pyenv/pyenv, use pyenv to install a user-level
  Python version, and then use that Python to install dependencies
  and run code. DON'T install anything directly into the system Python
  ----------------------------------------------------------------------------*/
#  provisioner "file" {
#    source = "requirements.txt"
#    destination = "/tmp/requirements.txt"
#  }
#
#  provisioner "file" {
#    source = "some_deps_script.py"
#    destination = "/tmp/some_deps_script.py"
#  }
#
#  provisioner "shell" {
#    inline = [
#      "~/.pyenv/shims/pip install -r /tmp/requirements.txt",
#      "~/.pyenv/shims/python /tmp/some_deps_script.py",
#    ]
#  }


  /*----------------------------------------------------------------------------
  if you need to run very complicated logic in Python, and you need to split your code
  into modules and create a Python library for it, this is the solution.

  You should create a bash script to prepare the Python virtualenv,
  then explicitly use the virtualenv Python interpreter to run your scripts.
  Please read the sample ``complicated_script.sh`` for more details.
  ----------------------------------------------------------------------------*/
#  provisioner "shell" {
#    script = "complicated_script.sh"
#  }
}

Step Level Parameter#

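Similar to workflow_param.json above, step_param.json stores parameters specific to this step. The most important ones are the step id of this step and the step id of the previous step. If the current step is the first step, previous_step_id is None.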
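One way to read the role of previous_step_id is that it decides which image a step builds on. The sketch below assumes that the first step starts from the root base image and that every later step starts from the AMI produced by its previous step; the function resolve_source_ami_id and the built_ami_ids mapping are hypothetical and only illustrate that chaining rule.

```python
from typing import Dict, Optional


def resolve_source_ami_id(
    previous_step_id: Optional[str],
    root_base_ami_id: str,
    built_ami_ids: Dict[str, str],
) -> str:
    """Pick the source AMI for a step (hypothetical helper, assumed chaining rule)."""
    if previous_step_id is None:
        # first step in the workflow: build on top of the root base image
        return root_base_ami_id
    try:
        # later steps: build on top of the AMI produced by the previous step
        return built_ami_ids[previous_step_id]
    except KeyError:
        raise LookupError(
            f"no AMI recorded for step {previous_step_id!r}; build that step first"
        ) from None
```

Raising an explicit error for a missing previous AMI makes an out-of-order build fail before packer is started.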

Manage AMIs#

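AWS provides many AMI APIs for listing images, getting details, and so on. They are still far less flexible than managing the metadata in a database, so in this project we use DynamoDB to manage AMI metadata, which makes AMIs easier to work with.

AmiData is an ORM class that lets developers work with DynamoDB in a Pythonic way and wraps common query patterns.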

Packer Build Script#

Important

This is the part that we, as maintainers of an AMI, actually have to write ourselves.

packer_ami_workflow/tests/example.py is a very thin wrapper that extends and wraps the utilities of the packer_ami_workflow library. It shows how to extend the default WorkflowParam and StepParam classes and how to specify the name of the AmiData DynamoDB table.

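With this wrapper in place, the developer only has three things to do:

1. Write the packer template logic in /workflow/step1/templates. See the official packer documentation for syntax and details.
2. Fill in the /workflow/workflow_param.json and /workflow/step1/step_param.json configuration files.
3. Run the /workflow/step1/packer_build.py script.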

Now let's go through the structure of the packer_build.py script. First, let's look at its source code.

Important

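packer_build.py is also one of our core scripts. I recommend reading the comments in its source code carefully to understand its logic.

The script itself is simple:

Create an AmiBuilder object. This class is already written in the packer_ami_workflow/tests/example.py wrapper mentioned above.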

builder = AmiBuilder.make_builder(dir_step=dir_here)

Build the AMI with the packer build command.

# dry_run=True: nothing happens; dry_run=False: run packer build
builder.run_packer_build_workflow(dry_run=True)

Tag the AMI with AWS tags.

builder.tag_ami()

Create a record in DynamoDB.

builder.create_dynamodb_item()

(Optional) Delete the AMI, optionally deleting its snapshots as well.

builder.delete_ami(delete_snapshot=False, skip_prompt=False)
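The internals of AmiBuilder are not shown in this chapter. As a hedged sketch of what the build and delete steps above might do underneath, the snippet below assembles a packer build command with a dry-run switch and deregisters an AMI through the EC2 API, optionally removing its snapshots. The function names and the var_file argument are hypothetical; ec2_client is assumed to be a boto3 EC2 client.

```python
import subprocess
from pathlib import Path
from typing import Any, List


def run_packer_build(dir_step: Path, var_file: str, dry_run: bool = True) -> List[str]:
    """Build the packer command; execute it only when dry_run is False."""
    # "." makes packer load every *.pkr.hcl file in the step directory
    command = ["packer", "build", f"-var-file={var_file}", "."]
    if dry_run:
        print("dry run, would execute:", " ".join(command))
        return command
    subprocess.run(command, cwd=dir_step, check=True)
    return command


def delete_ami(ec2_client: Any, image_id: str, delete_snapshot: bool = False) -> List[str]:
    """Deregister an AMI and return the ids of the snapshots that were deleted."""
    # collect snapshot ids first; they are harder to find once the AMI is gone
    response = ec2_client.describe_images(ImageIds=[image_id])
    snapshot_ids = [
        mapping["Ebs"]["SnapshotId"]
        for image in response["Images"]
        for mapping in image.get("BlockDeviceMappings", [])
        if "SnapshotId" in mapping.get("Ebs", {})
    ]
    # a snapshot cannot be deleted while its AMI is still registered
    ec2_client.deregister_image(ImageId=image_id)
    deleted: List[str] = []
    if delete_snapshot:
        for snapshot_id in snapshot_ids:
            ec2_client.delete_snapshot(SnapshotId=snapshot_id)
            deleted.append(snapshot_id)
    return deleted
```

Keeping dry_run=True as the default means an accidental run of the script prints the command instead of launching an EC2 build.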

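There is one special case: some steps in the packer template really cannot be automated. In that case you can manually launch an EC2 instance from the previous step's AMI, SSH into it, provision the environment by hand, exit, stop the instance, manually create an image, and then terminate the instance. (I have a small tool, ssh2awsec2, that makes it easy to SSH into EC2.) Here is an example:

# fill in this ec2 instance id manually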
builder.create_image_manually(instance_id="i-a1b2c3d4")
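What create_image_manually does internally is not shown here. A minimal sketch of the underlying EC2 operation, assuming a boto3 EC2 client and a hypothetical helper name, could look like this: create an image from the stopped instance, then optionally wait until the image becomes available.

```python
from typing import Any


def create_image_from_instance(
    ec2_client: Any,
    instance_id: str,
    image_name: str,
    wait: bool = True,
) -> str:
    """Create an AMI from an existing instance and return the new image id."""
    response = ec2_client.create_image(InstanceId=instance_id, Name=image_name)
    image_id = response["ImageId"]
    if wait:
        # "image_available" is a built-in boto3 EC2 waiter
        ec2_client.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id
```

Stopping the instance before calling this, as the paragraph above describes, avoids capturing a file system that is still being written to.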
