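A First Look at Infrastructure as Code: Developing a Terraform Provider to Manage Private Cloud MySQL Instances

Original · 智汇云 · 360智汇云开发者

December 19, 2024, 03:07

Infrastructure as Code (IaC) has become an indispensable part of DevOps practice in the cloud era. By managing and configuring infrastructure through code, we can treat infrastructure with the same engineering discipline we apply to software. In the IaC space, Terraform is one of the most popular tools.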

1
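Terraform and Provider Overview

Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. It uses a declarative language (HCL) to describe the desired state of the infrastructure, then creates and manages resources automatically from that description.

At its core, Terraform is a state management tool that performs CRUD operations on the resources it manages. Those resources are often cloud-based, but in principle any resource whose lifecycle can be expressed as CRUD can be managed by Terraform.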
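The "state management plus CRUD" idea can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not how Terraform is implemented: each resource is reduced to a single string of attributes, and the names Plan, Action, and the sample instance names are invented for the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is the CRUD operation needed to move one resource to its desired state.
type Action string

const (
	Create Action = "create"
	Update Action = "update"
	Delete Action = "delete"
	NoOp   Action = "no-op"
)

// Plan compares the desired configuration with the recorded state.
// Both maps are keyed by resource name; the value stands in for the
// resource's attributes (simplified to one string for this sketch).
func Plan(desired, current map[string]string) map[string]Action {
	actions := make(map[string]Action)
	for name, want := range desired {
		have, exists := current[name]
		switch {
		case !exists:
			actions[name] = Create
		case have != want:
			actions[name] = Update
		default:
			actions[name] = NoOp
		}
	}
	// Anything recorded in state but no longer desired must be deleted.
	for name := range current {
		if _, wanted := desired[name]; !wanted {
			actions[name] = Delete
		}
	}
	return actions
}

func main() {
	desired := map[string]string{"db-a": "2c4g", "db-b": "4c8g"}
	current := map[string]string{"db-b": "2c4g", "db-c": "2c4g"}

	actions := Plan(desired, current)
	names := make([]string, 0, len(actions))
	for name := range actions {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %s\n", name, actions[name])
	}
}
```

Reading the current state and carrying out each action against the real system is the part a Provider is responsible for.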

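In Terraform, the Provider is one of the core components. It abstracts the interaction with a specific cloud service or other infrastructure resource. It is a plugin that acts as a bridge between Terraform and external systems, allowing Terraform to create, manage, modify, and delete external resources. Each Provider is responsible for a particular class of resources; for example, the AWS Provider lets us manage AWS resources such as EC2 instances and S3 buckets. Through its rich Provider ecosystem, Terraform can manage almost all mainstream cloud resources.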

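Main functions of a Provider

1. Resource management

Defines the resource types that can be created, modified, and deleted.

2. Data source queries

Defines read-only data sources used to fetch information from external systems.

3. State synchronization

The Provider retrieves the current state of resources through API calls so that Terraform's state file stays consistent with the real resources.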

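Although the Terraform ecosystem offers a rich set of Providers, in some scenarios the standard ones cannot meet business needs. In that case, developing a custom Provider becomes the key to solving the problem.

So when do we need to create a custom Provider?

Supporting custom resources: if your infrastructure includes custom resources or services (for example an in-house private cloud platform, a proprietary API, or company-specific tools) that are not covered by the official Terraform Providers, a custom Provider brings them under Infrastructure as Code (IaC) management.
Opening up a service: if your service will be offered to external customers in the future, a custom Provider can be an important part of infrastructure automation, making it easy for customers to integrate your service through Terraform.
Extending an existing Provider: if an existing Provider does not meet your needs (for example, it lacks support for certain resource types or its operations are not flexible enough), a custom Provider can extend it.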

2
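Developing a Custom Provider

Before getting into this topic, let's first look at the overall workflow of a Provider:

As we can see, the Provider is the bridge connecting Terraform to the concrete service API. If we want to implement a Provider that manages private cloud MySQL, what it calls is still our private cloud's own API; Terraform and the Provider simply spare Terraform users the tedious work of integrating with the private cloud API themselves.

It follows that developing a Provider essentially means adapting our API to the Terraform Provider interfaces.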
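To make "interface adaptation" concrete, here is a minimal sketch in plain Go. Everything in it is hypothetical: InstanceAPI stands in for a private cloud API, fakeAPI is an in-memory substitute, and Adapter plays the role a Provider resource plays. The real Terraform Plugin Framework interfaces are shown later in this article and are richer than this.

```go
package main

import (
	"errors"
	"fmt"
)

// InstanceAPI is a hypothetical stand-in for the private cloud's own API.
type InstanceAPI interface {
	CreateInstance(name string) (id string, err error)
	GetInstance(id string) (name string, err error)
	DeleteInstance(id string) error
}

// fakeAPI is an in-memory InstanceAPI used only for this sketch.
type fakeAPI struct {
	nextID    int
	instances map[string]string
}

func newFakeAPI() *fakeAPI { return &fakeAPI{instances: map[string]string{}} }

func (f *fakeAPI) CreateInstance(name string) (string, error) {
	f.nextID++
	id := fmt.Sprintf("ins-%d", f.nextID)
	f.instances[id] = name
	return id, nil
}

func (f *fakeAPI) GetInstance(id string) (string, error) {
	name, ok := f.instances[id]
	if !ok {
		return "", errors.New("instance not found: " + id)
	}
	return name, nil
}

func (f *fakeAPI) DeleteInstance(id string) error {
	if _, ok := f.instances[id]; !ok {
		return errors.New("instance not found: " + id)
	}
	delete(f.instances, id)
	return nil
}

// State is what the tool records about one managed instance.
type State struct{ ID, Name string }

// Adapter translates lifecycle calls into calls on the underlying API.
type Adapter struct{ api InstanceAPI }

func (a Adapter) Create(name string) (State, error) {
	id, err := a.api.CreateInstance(name)
	if err != nil {
		return State{}, err
	}
	return State{ID: id, Name: name}, nil
}

func (a Adapter) Read(s State) (State, error) {
	name, err := a.api.GetInstance(s.ID)
	if err != nil {
		return State{}, err
	}
	return State{ID: s.ID, Name: name}, nil
}

func (a Adapter) Delete(s State) error { return a.api.DeleteInstance(s.ID) }

func main() {
	adapter := Adapter{api: newFakeAPI()}
	state, err := adapter.Create("example")
	if err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Println(state.ID, state.Name)
}
```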

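HashiCorp currently provides two SDKs for developing Terraform Providers: Terraform Plugin Framework and Terraform Plugin SDK. Terraform Plugin Framework is the new-generation framework officially recommended by HashiCorp. It has a more modern design, is built on Go Context and gRPC, emphasizes extensibility and modularity, supports finer-grained control, and offers better type safety. This article therefore uses Terraform Plugin Framework, and we recommend the new official framework for all new Providers as well.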

1. Environment requirements

1. Go 1.21+

2. Terraform v1.8+

3. Your own private cloud API service (or the API of any resource you want to integrate)

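2. Provider resource definitions

The main purpose of any Terraform Provider is to supply resources to Terraform. There are two kinds: resources (also called managed resources) and data sources. A managed resource implements the create, read, update, and delete (CRUD) methods and therefore supports full lifecycle management. A data source is simpler and implements only the Read part of CRUD. There is also one special kind of definition, the Provider itself. Let's use the AWS configuration as an example:

Provider definition

The provider block configures how Terraform interacts with a specific service.
Common settings include credentials, the API address, and the default region.

provider "aws" {
  region     = "us-east-1"
  access_key = "your_access_key"
  secret_key = "your_secret_key"
}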

Resource definition (using AWS S3 as an example)

A resource block defines a concrete resource managed by the Provider. Such resources support the full set of CRUD operations.

resource "aws_s3_bucket" "example_bucket" {
  bucket = "my-example-bucket"
  acl    = "private"

  tags = {
    Name        = "My bucket"
    Environment = "Dev"
  }
}

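Data Sources definition (using AWS S3 as an example)

A data block refers to a resource that already exists in an external service rather than creating a new one.

data "aws_s3_bucket" "existing_bucket" {
  bucket = "existing-bucket-name"
}

From these definitions we can work out a rough development order for a Provider: first develop the part behind the provider block, then the parts behind the resource and data blocks.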

3. Provider structure design

First, let's look at how the official SDK defines the provider interface:

type Provider interface {
	// Metadata should return the metadata for the provider, such as
	// a type name and version data.
	//
	// Implementing the MetadataResponse.TypeName will populate the
	// datasource.MetadataRequest.ProviderTypeName and
	// resource.MetadataRequest.ProviderTypeName fields automatically.
	Metadata(context.Context, MetadataRequest, *MetadataResponse)

	// Schema should return the schema for this provider.
	Schema(context.Context, SchemaRequest, *SchemaResponse)

	// Configure is called at the beginning of the provider lifecycle, when
	// Terraform sends to the provider the values the user specified in the
	// provider configuration block. These are supplied in the
	// ConfigureProviderRequest argument.
	// Values from provider configuration are often used to initialise an
	// API client, which should be stored on the struct implementing the
	// Provider interface.
	Configure(context.Context, ConfigureRequest, *ConfigureResponse)

	// DataSources returns a slice of functions to instantiate each DataSource
	// implementation.
	//
	// The data source type name is determined by the DataSource implementing
	// the Metadata method. All data sources must have unique names.
	DataSources(context.Context) []func() datasource.DataSource

	// Resources returns a slice of functions to instantiate each Resource
	// implementation.
	//
	// The resource type name is determined by the Resource implementing
	// the Metadata method. All resources must have unique names.
	Resources(context.Context) []func() resource.Resource
}

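To implement this interface we need to implement the Metadata, Schema, Configure, DataSources, and Resources methods.

The Metadata method provides the Provider's metadata, such as its type name (TypeName) and version. This information identifies the Provider and is used when interacting with Terraform core.

The Schema method defines the Provider's configuration structure. For example, when configuring the Provider in Terraform, the user may need to specify API credentials or a target address. These settings are defined through this method.

The Configure method initializes the Provider's runtime environment. It typically parses the user's configuration (such as credentials or other required initialization data) and builds a client instance or other related resources.

The DataSources method returns all data source types the Provider supports. Each data source reads data from an external system (such as an API) and makes it available to Terraform.

The Resources method returns all managed resource types the Provider supports. Each resource represents an entity Terraform can manage, such as a virtual machine or database instance in a cloud service.

As the first step, let's think about how to implement Schema. We already know that the Schema method defines the Provider's configuration structure. Since the example in this article is a Provider for managing private cloud MySQL, the API service we integrate with is the 智汇云 OpenAPI. From the 智汇云 OpenAPI documentation we know that interacting with it requires an Endpoint, an AccessKeyId, and an AccessKeySecret, so our Schema is simply a structured form of this information.

We implement this interface in provider.go. Part of that file is shown below.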


// Ensure the implementation satisfies the expected interfaces.
var (
	_ provider.Provider = &ZyunDbProvider{}
)

// New is a helper function to simplify provider server and testing implementation.
func New(version string) func() provider.Provider {
	return func() provider.Provider {
		return &ZyunDbProvider{
			version: version,
		}
	}
}

// ZyunDbProvider defines the provider implementation.
type ZyunDbProvider struct {
	// version is set to the provider version on release, "dev" when the
	// provider is built and ran locally, and "test" when running acceptance
	// testing.
	version string
}

// Metadata returns the provider type name.
func (p *ZyunDbProvider) Metadata(ctx context.Context, req provider.MetadataRequest, resp *provider.MetadataResponse) {
	resp.TypeName = "zyundb"
	resp.Version = p.version
}

// Schema defines the provider-level schema for configuration data.
func (p *ZyunDbProvider) Schema(ctx context.Context, req provider.SchemaRequest, resp *provider.SchemaResponse) {
	resp.Schema = schema.Schema{
		Attributes: map[string]schema.Attribute{
			"endpoint": schema.StringAttribute{
				Description: "The endpoint of the ZyunDB API",
				Required:    true,
			},
			"access_key_id": schema.StringAttribute{
				Description: "The access key id of the ZyunDB API",
				Required:    true,
			},
			"access_key_secret": schema.StringAttribute{
				Description: "The access key secret of the ZyunDB API",
				Required:    true,
				Sensitive:   true,
			},
		},
	}
}

// Configure prepares a ZyunDB API client for data sources and resources.
func (p *ZyunDbProvider) Configure(ctx context.Context, req provider.ConfigureRequest, resp *provider.ConfigureResponse) {
	// Retrieve provider data from configuration
	var config zyundbProviderModel
	diags := req.Config.Get(ctx, &config)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	// If practitioner provided a configuration value for any of the
	// attributes, it must be a known value.
	if config.Endpoint.IsUnknown() {
		resp.Diagnostics.AddAttributeError(
			path.Root("endpoint"),
			"Unknown ZyunDB API Endpoint",
			"The provider cannot create the ZyunDB API client as there is an unknown configuration value for the ZyunDB API endpoint. "+
				"Either target apply the source of the value first, set the value statically in the configuration, or use the ZYUNDB_ENDPOINT environment variable.",
		)
	}
	// ...

	if resp.Diagnostics.HasError() {
		return
	}

	// Default values to environment variables, but override
	// with Terraform configuration value if set.
	endpoint := os.Getenv("ZYUNDB_ENDPOINT")

	if !config.Endpoint.IsNull() {
		endpoint = config.Endpoint.ValueString()
	}

	// If any of the expected configurations are missing, return
	// errors with provider-specific guidance.
	if endpoint == "" {
		resp.Diagnostics.AddAttributeError(
			path.Root("endpoint"),
			"Missing ZyunDB API Endpoint",
			"The provider cannot create the ZyunDB API client as there is a missing or empty value for the ZyunDB API endpoint. "+
				"Set the endpoint value in the configuration or use the ZYUNDB_ENDPOINT environment variable. "+
				"If either is already set, ensure the value is not empty.",
		)
	}
	// ...

	if resp.Diagnostics.HasError() {
		return
	}

	// Create a new ZyunDB client using the configuration values.
	// The variable is named apiClient so that it does not shadow the client package.
	apiClient := client.NewZyunOpenApiClient(endpoint, accessKeyId, accessKeySecret, "v1", "https")
	if apiClient == nil {
		resp.Diagnostics.AddError(
			"Unable to Create ZyunDB API Client",
			"An unexpected error occurred when creating the ZyunDB API client. "+
				"If the error is not clear, please contact the provider developers.",
		)
		return
	}

	// Make the ZyunDB client available during DataSource and Resource
	// type Configure methods.
	resp.DataSourceData = apiClient
	resp.ResourceData = apiClient

	tflog.Info(ctx, "Configured ZyunDB client end")
}

// Resources defines the resources implemented in the provider.
func (p *ZyunDbProvider) Resources(ctx context.Context) []func() resource.Resource {
	return []func() resource.Resource{
		NewMysqlInstanceResource,
	}
}

// DataSources defines the data sources implemented in the provider.
func (p *ZyunDbProvider) DataSources(_ context.Context) []func() datasource.DataSource {
	return []func() datasource.DataSource{
		NewMysqlInstanceDataSource,
	}
}

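Here tflog is a logging library provided by Terraform that helps us print logs.

Every field in the Schema method is declared with Required: true because all of them are indispensable for calling the 智汇云 OpenAPI. Sensitive: true marks a field as containing sensitive information such as a password or key, so Terraform redacts its value in CLI output such as plan and apply. Note that sensitive values are still written to the state file in plain text, so the state file itself must be protected.

In the Configure method, if a setting is not found in the configuration, we look for it in environment variables. Note that because the attributes above are declared Required, Terraform rejects a configuration that omits them, so for the environment-variable fallback to take effect they would need to be declared Optional instead. With the credentials we find, we create a new ZyunDB API client. Because this is a demo, the whole client is implemented inside the Provider project. For a production project, it is better to abstract the API details into an SDK of your own.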
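The precedence rule used in Configure (environment variable as the default, explicit configuration as the override, an error when both are empty) can be isolated into a small helper. This is a sketch under stated assumptions: resolve is a hypothetical name, a nil pointer stands in for a null configuration value, and the environment lookup is passed in as a function so the logic is easy to exercise.

```go
package main

import (
	"fmt"
	"os"
)

// resolve returns the effective value of one provider setting.
// configured is nil when the user did not set the attribute (a stand-in
// for a null framework value); getenv is normally os.Getenv.
func resolve(configured *string, envName string, getenv func(string) string) (string, error) {
	// Start from the environment variable as the default.
	value := getenv(envName)

	// An explicitly configured value always wins, even if it is empty.
	if configured != nil {
		value = *configured
	}

	if value == "" {
		return "", fmt.Errorf("missing value: set it in the provider block or in %s", envName)
	}
	return value, nil
}

func main() {
	endpoint, err := resolve(nil, "ZYUNDB_ENDPOINT", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("endpoint:", endpoint)
}
```

The same helper could be called once per setting (endpoint, access key id, access key secret), collecting the errors into diagnostics instead of returning on the first one.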

The Resources and DataSources methods return the two kinds of resources that we define in the following sections.
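Returning a slice of factory functions, rather than instances, lets the caller create a fresh value whenever it needs one and index the types by name. The sketch below imitates that pattern with a pared-down, hypothetical Resource interface; BuildRegistry is an invented helper and is not the framework's actual code, although the uniqueness check mirrors the "all resources must have unique names" requirement quoted in the interface comments.

```go
package main

import "fmt"

// Resource is a pared-down stand-in for the framework's resource.Resource.
type Resource interface {
	// TypeName returns the full resource type name for the given provider.
	TypeName(providerTypeName string) string
}

type mysqlInstance struct{}

func (mysqlInstance) TypeName(providerTypeName string) string {
	return providerTypeName + "_mysql_instance"
}

// NewMysqlInstance is the factory handed to the registry.
func NewMysqlInstance() Resource { return mysqlInstance{} }

// BuildRegistry instantiates each factory once to learn its type name and
// indexes the factories by that name, rejecting duplicates.
func BuildRegistry(providerTypeName string, factories []func() Resource) (map[string]func() Resource, error) {
	registry := make(map[string]func() Resource, len(factories))
	for _, factory := range factories {
		name := factory().TypeName(providerTypeName)
		if _, duplicate := registry[name]; duplicate {
			return nil, fmt.Errorf("duplicate resource type name %q", name)
		}
		registry[name] = factory
	}
	return registry, nil
}

func main() {
	registry, err := BuildRegistry("zyundb", []func() Resource{NewMysqlInstance})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name := range registry {
		fmt.Println("registered:", name)
	}
}
```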

With that, the structure of the provider part is complete. Let's move on to the next step.

4. Resource structure design

First, let's look at how the official SDK defines the resource interface:

type Resource interface {
	// Metadata should return the full name of the resource, such as
	// examplecloud_thing.
	Metadata(context.Context, MetadataRequest, *MetadataResponse)

	// Schema should return the schema for this resource.
	Schema(context.Context, SchemaRequest, *SchemaResponse)

	// Create is called when the provider must create a new resource. Config
	// and planned state values should be read from the
	// CreateRequest and new state values set on the CreateResponse.
	Create(context.Context, CreateRequest, *CreateResponse)

	// Read is called when the provider must read resource values in order
	// to update state. Planned state values should be read from the
	// ReadRequest and new state values set on the ReadResponse.
	Read(context.Context, ReadRequest, *ReadResponse)

	// Update is called to update the state of the resource. Config, planned
	// state, and prior state values should be read from the
	// UpdateRequest and new state values set on the UpdateResponse.
	Update(context.Context, UpdateRequest, *UpdateResponse)

	// Delete is called when the provider must delete the resource. Config
	// values may be read from the DeleteRequest.
	//
	// If execution completes without error, the framework will automatically
	// call DeleteResponse.State.RemoveResource(), so it can be omitted
	// from provider logic.
	Delete(context.Context, DeleteRequest, *DeleteResponse)
}

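To implement this interface we need to implement the Metadata, Schema, Create, Read, Update, and Delete methods.

The Metadata method provides the resource's metadata and identifies its unique name (the resource type name).

Schema defines all attributes of the resource and their types.

Create, Read, Update, and Delete need little explanation; they correspond to creating, reading, updating, and deleting the resource.

This raises a question: how do we get hold of the provider configuration defined above, the settings that let us connect to the API? That is the job of another interface, ResourceWithConfigure. Let's look at its definition: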


type ResourceWithConfigure interface {
	Resource

	// Configure enables provider-level data or clients to be set in the
	// provider-defined Resource type. It is separately executed for each
	// ReadResource RPC.
	Configure(context.Context, ConfigureRequest, *ConfigureResponse)
}

Implementing this interface allows a resource to receive the Provider's global configuration (such as credentials, a client instance, or region settings) when it is initialized.
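The core of such a Configure method is a nil check followed by a type assertion on the data the provider handed over. Here is a minimal sketch in plain Go; apiClient and instanceResource are hypothetical stand-ins for the real client and resource types, and the method returns an error where the framework version would append a diagnostic.

```go
package main

import "fmt"

// apiClient is a hypothetical stand-in for the provider's API client.
type apiClient struct{ endpoint string }

// instanceResource holds the client it receives from the provider.
type instanceResource struct{ client *apiClient }

// configure mirrors the shape of a resource Configure method.
// providerData is nil until the provider itself has been configured,
// so nil is not an error and simply leaves the resource untouched.
func (r *instanceResource) configure(providerData any) error {
	if providerData == nil {
		return nil
	}

	client, ok := providerData.(*apiClient)
	if !ok {
		return fmt.Errorf("expected *apiClient, got %T", providerData)
	}

	r.client = client
	return nil
}

func main() {
	r := &instanceResource{}
	if err := r.configure(&apiClient{endpoint: "https://api.example"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("client endpoint:", r.client.endpoint)
}
```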

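Now let's start implementing the resource.

As the first step, we again need to think about the Schema. In an API-driven model, the parameters of a resource's various operations are not necessarily consistent. For example, when we create a MySQL instance with an automatically assigned port, the create API never receives a port, yet the port is present when we fetch the instance. Likewise, at creation time we never say how many replica nodes the instance should have, yet scaling out requires stating how many nodes to add in which data center. In IaC, the code is the resource: change the code and the resource is updated. We therefore need to consider the inputs and outputs of all these APIs together and treat the whole Schema as the complete description of the resource.

First, the input parameters for creating a MySQL instance:

Next, the input parameters for a configuration change. Note that extendInfos is an entirely extra parameter, unrelated to both instance creation and the instance details below, yet we need it to scale out the nodes of the whole instance:

Finally, the response of the MySQL instance details API. From the response below we can see that, with some processing, the master and slave information can be passed as parameters to the configuration change API: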


{
  "status": 200,
  "developer-message": "",
  "more-info": "",
  "errno-code": 0,
  "user-message": "",
  "data": {
    "id": "xxx",
    "port": "xxx",
    "status": "1",
    "ctime": "2024-09-23 19:42:39",
    "utime": "2024-09-23 19:52:13",
    "pkg_id": "xxx",
    "db_type": "master-slave",
    "is_audit": "1",
    "name": "test_name",
    "instance_type": "EXCLUSIVE",
    "network_id": "xxx",
    "subnet_id": "xxx",
    "idc": ["xxidc"],
    "rs_num": [
      { "cnt": "2", "idc": "xxidc" }
    ],
    "master": [
      { "ip": "1.1.1.1", "idc": "xxidc", "type": "master", "idc_name": "北京A区" }
    ],
    "slave": [
      { "ip": "2.2.2.2", "idc": "xxidc", "type": "slave", "idc_name": "北京A区" }
    ],
    "vip_data": [
      { "port": "xxx", "idc": "xxidc", "vip": "3.3.3.3", "rw_status": "6", "idc_name": "北京A区" }
    ]
  }
}
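As a rough illustration of "processing the master and slave information", the sketch below decodes the relevant part of this response with encoding/json and counts nodes per data center. The field names come from the response above; the type names, the helper functions, and the idea that per-IDC counts are the useful intermediate form are assumptions, since the exact shape of the configuration change parameters is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// node is one entry of the "master" or "slave" arrays in the response.
type node struct {
	IP  string `json:"ip"`
	IDC string `json:"idc"`
}

// instanceDetail decodes only the fields this sketch needs.
type instanceDetail struct {
	Data struct {
		ID     string `json:"id"`
		Master []node `json:"master"`
		Slave  []node `json:"slave"`
	} `json:"data"`
}

// parseDetail decodes a raw instance details response.
func parseDetail(raw []byte) (instanceDetail, error) {
	var detail instanceDetail
	err := json.Unmarshal(raw, &detail)
	return detail, err
}

// countByIDC returns how many nodes live in each data center.
func countByIDC(nodes []node) map[string]int {
	counts := make(map[string]int)
	for _, n := range nodes {
		counts[n.IDC]++
	}
	return counts
}

func main() {
	raw := []byte(`{"data":{"id":"xxx",
		"master":[{"ip":"1.1.1.1","idc":"xxidc"}],
		"slave":[{"ip":"2.2.2.2","idc":"xxidc"}]}}`)

	detail, err := parseDetail(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("masters per IDC:", countByIDC(detail.Data.Master))
	fmt.Println("slaves per IDC:", countByIDC(detail.Data.Slave))
}
```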

Based on the information above, we tentatively define the Schema model as follows:

type mysqlInstanceResourceModel struct {
	ID           types.String   `tfsdk:"id"`
	ProjectID    types.String   `tfsdk:"project_id"`
	Port         types.String   `tfsdk:"port"`
	Name         types.String   `tfsdk:"name"`
	PkgID        types.String   `tfsdk:"pkg_id"`
	InstanceType types.String   `tfsdk:"instance_type"`
	Mode         types.String   `tfsdk:"mode"`
	MasterIDC    types.String   `tfsdk:"master_idc"`
	RedundantIDC types.String   `tfsdk:"redundant_idc"`
	NetworkID    types.String   `tfsdk:"network_id"`
	SubnetID     types.String   `tfsdk:"subnet_id"`
	IsAuditLog   types.String   `tfsdk:"is_audit_log"`
	Status       types.String   `tfsdk:"status"`
	CTime        types.String   `tfsdk:"ctime"`
	VipData      types.List     `tfsdk:"vip_data"`
	MasterNum    types.List     `tfsdk:"master_num"`
	SlaveNum     types.List     `tfsdk:"slave_num"`
	Timeouts     timeouts.Value `tfsdk:"timeouts"`
}

Most of these fields come from the create-instance API, while VipData, MasterNum, and SlaveNum are mainly there to show users how to connect and to make configuration changes easier.

We name the file that defines this resource mysql_instance_resource.go. Part of the code is shown below:

// Ensure the implementation satisfies the expected interfaces.
var (
	_ resource.Resource              = &mysqlInstanceResource{}
	_ resource.ResourceWithConfigure = &mysqlInstanceResource{}
)

// NewMysqlInstanceResource is a helper function to simplify the provider implementation.
func NewMysqlInstanceResource() resource.Resource {
	return &mysqlInstanceResource{}
}

// mysqlInstanceResource is the resource implementation.
type mysqlInstanceResource struct {
	client *client.ZyunOpenAPI
}

// mysqlInstanceResourceModel maps the resource schema data.
type mysqlInstanceResourceModel struct {
	// ...
}

// Metadata returns the resource type name.
func (r *mysqlInstanceResource) Metadata(_ context.Context, req resource.MetadataRequest, resp *resource.MetadataResponse) {
	resp.TypeName = req.ProviderTypeName + "_mysql_instance"
}

// Schema defines the schema for the resource.
func (r *mysqlInstanceResource) Schema(ctx context.Context, _ resource.SchemaRequest, resp *resource.SchemaResponse) {
	resp.Schema = schema.Schema{
		Attributes: map[string]schema.Attribute{
			"timeouts": timeouts.Attributes(ctx, timeouts.Opts{
				Create: true,
			}),
			"id": schema.StringAttribute{
				Description: "The id of the mysql instance",
				Computed:    true,
				PlanModifiers: []planmodifier.String{
					stringplanmodifier.UseStateForUnknown(),
				},
			},
			"port": schema.StringAttribute{
				Description: "The port of the mysql instance",
				Computed:    true,
				PlanModifiers: []planmodifier.String{
					stringplanmodifier.UseStateForUnknown(),
					stringplanmodifier.RequiresReplace(),
				},
			},
			"network_id": schema.StringAttribute{
				Description: "The network id of the mysql instance",
				Required:    true,
			},
			"vip_data": schema.ListNestedAttribute{
				Description: "The vip data of the mysql instance",
				Computed:    true,
				NestedObject: schema.NestedAttributeObject{
					Attributes: map[string]schema.Attribute{
						"vip": schema.StringAttribute{
							Description: "The vip of the vip",
							Computed:    true,
						},
						// ...
					},
				},
			},
			// ...
		},
	}
}
// Configure adds the provider configured client to the resource.
func (r *mysqlInstanceResource) Configure(_ context.Context, req resource.ConfigureRequest, resp *resource.ConfigureResponse) {
	// Add a nil check when handling ProviderData because Terraform
	// sets that data after it calls the ConfigureProvider RPC.
	if req.ProviderData == nil {
		return
	}

	client, ok := req.ProviderData.(*client.ZyunOpenAPI)

	if !ok {
		resp.Diagnostics.AddError(
			"Unexpected Resource Configure Type",
			fmt.Sprintf("Expected *client.ZyunOpenAPI, got: %T. Please report this issue to the provider developers.", req.ProviderData),
		)

		return
	}

	r.client = client
}
// Create creates the resource and sets the initial Terraform state.
func (r *mysqlInstanceResource) Create(ctx context.Context, req resource.CreateRequest, resp *resource.CreateResponse) {
	var plan mysqlInstanceResourceModel
	diags := req.Plan.Get(ctx, &plan)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	// Derive a context bounded by the create timeout (20 minutes by default)
	createTimeout, diags := plan.Timeouts.Create(ctx, 20*time.Minute)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}
	ctx, cancel := context.WithTimeout(ctx, createTimeout)
	defer cancel()

	// Generate API request body from plan
	// Create new mysql instance
	result, err := r.client.CreateMySQLInstance(ctx, &client.CreateMySQLInstanceParams{
		ProjectID: plan.ProjectID.ValueString(),
		// ...
	})
	if err != nil {
		resp.Diagnostics.AddError(
			"Error creating mysql instance",
			"Could not create mysql instance, unexpected error: "+err.Error(),
		)
		return
	}

	instanceID := result.Detail.InsID
	plan.ID = types.StringValue(instanceID)

	// Poll to check the instance status
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()

CheckLoop:
	for {
		select {
		case <-ctx.Done():
			resp.Diagnostics.AddError(
				"Timeout waiting for MySQL instance creation",
				fmt.Sprintf("Instance %s creation did not complete within the timeout period", instanceID),
			)
			return
		case <-ticker.C:
			instance, err := r.client.GetMySQLInstance(ctx, plan.ProjectID.ValueString(), instanceID)
			if err != nil {
				// A failed status query is retried on the next tick
				continue
			}

			// Status 1 means ready
			if instance.Status == "1" {
				// Copy the returned attributes into plan (plan.xx = xx)
				// ...
				break CheckLoop
			} else if instance.Status == "2" { // 2 means failed
				resp.Diagnostics.AddError(
					"Error creating mysql instance",
					"MySQL instance creation failed",
				)
				return
			}
		}
	}

	diags = resp.State.Set(ctx, plan)
	resp.Diagnostics.Append(diags...)
}
// Read refreshes the Terraform state with the latest data.
func (r *mysqlInstanceResource) Read(ctx context.Context, req resource.ReadRequest, resp *resource.ReadResponse) {
	// Get current state
	// ...

	// Get the refreshed instance from the ZyunDB API
	instance, err := r.client.GetMySQLInstance(ctx, state.ProjectID.ValueString(), state.ID.ValueString())
	if err != nil {
		resp.Diagnostics.AddError(
			"Error Reading mysql instance",
			"Could not read mysql instance ID "+state.ID.ValueString()+": "+err.Error(),
		)
		return
	}
	// ...

	// Set refreshed state
	diags = resp.State.Set(ctx, &state)
	resp.Diagnostics.Append(diags...)
}

// Update updates the resource and sets the updated Terraform state on success.
func (r *mysqlInstanceResource) Update(ctx context.Context, req resource.UpdateRequest, resp *resource.UpdateResponse) {
	// Build params from the values to update
	// ...

	err := r.client.UpdateMySQLInstance(ctx, params)
	if err != nil {
		// ...
		return
	}

	resp.Diagnostics.AddWarning(
		"Asynchronous Operation",
		"The MySQL instance update is an asynchronous operation. Use 'terraform refresh' to get the latest state after the update completes.",
	)

	diags = resp.State.Set(ctx, plan)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}
}

// Delete deletes the resource and removes the Terraform state on success.
func (r *mysqlInstanceResource) Delete(ctx context.Context, req resource.DeleteRequest, resp *resource.DeleteResponse) {
	// Retrieve values from state
	// ...

	// Delete existing mysql instance
	err := r.client.DeleteMySQLInstance(ctx, state.ProjectID.ValueString(), state.ID.ValueString())
	if err != nil {
		resp.Diagnostics.AddError(
			"Error Deleting mysql instance",
			"Could not delete mysql instance, unexpected error: "+err.Error(),
		)
		return
	}
}

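A few parts of the code above deserve attention:

1. The schema definition uses two attributes that have not appeared earlier, Computed: true and Optional: true. Computed: true means the value of the attribute is computed and filled in by the Provider rather than supplied directly by the user, as with an automatically generated instance ID. Optional: true means the attribute is optional, so the user can choose whether to set it in the Terraform configuration.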

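2. stringplanmodifier.UseStateForUnknown() and stringplanmodifier.RequiresReplace().

UseStateForUnknown applies when Terraform cannot determine the new value of an attribute during planning (the value is unknown). The modifier tells Terraform to use the value from the current state as the planned value. It suits attributes whose value comes from an external computation (such as a remote API response) that is not available at plan time but does not change once set, so the existing value is a valid substitute.

RequiresReplace means that if the attribute's value changes in the configuration, the whole resource must be destroyed and recreated (a replace operation). This is quite common, because many attributes that can be set at creation time cannot be changed by an update. With this modifier set, Terraform handles the delete-and-recreate flow for us automatically.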
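The decision each modifier makes can be modeled in a few lines. The sketch below is a simplified model and not the framework's implementation: Value is a hypothetical three-state value (null, unknown, known), and the resourceExists flag stands in for the framework's check that the resource is not being created.

```go
package main

import "fmt"

type valueState int

const (
	null valueState = iota
	unknown
	known
)

// Value is a simplified stand-in for a framework string value.
type Value struct {
	State valueState
	Str   string
}

// useStateForUnknown keeps the prior state value when the planned value
// is unknown and the prior state already holds a known value.
func useStateForUnknown(prior, planned Value) Value {
	if planned.State == unknown && prior.State == known {
		return prior
	}
	return planned
}

// requiresReplace reports whether a change to this attribute should force
// the resource to be destroyed and recreated. A resource that does not
// exist yet is simply created, so nothing needs replacing.
func requiresReplace(resourceExists bool, prior, planned Value) bool {
	return resourceExists && prior != planned
}

func main() {
	priorID := Value{State: known, Str: "ins-1"}
	plannedID := useStateForUnknown(priorID, Value{State: unknown})
	fmt.Println("planned id:", plannedID.Str)

	priorPort := Value{State: known, Str: "3306"}
	newPort := Value{State: known, Str: "3307"}
	fmt.Println("replace needed:", requiresReplace(true, priorPort, newPort))
}
```

In this model, a computed id survives planning unchanged, while a changed port on an existing instance is flagged for replacement.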

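3. Polling in Create

In most cases a resource is not created instantly, and the API itself is asynchronous. We can poll with a timeout so that the attributes of the finished resource can be written back. Here we combine a ticker with Terraform's timeout context (note that the schema must define a timeouts attribute, and the final timeout can be set by the user in the resource configuration):

	// Derive a context bounded by the create timeout (20 minutes by default)
	createTimeout, diags := plan.Timeouts.Create(ctx, 20*time.Minute)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}
	ctx, cancel := context.WithTimeout(ctx, createTimeout)
	defer cancel()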
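The ticker-plus-context pattern can be pulled out into a reusable helper and exercised without any real API. This is a sketch using only the standard library: waitUntilReady and pollDemo are hypothetical names, the status codes "1" (ready) and "2" (failed) follow the Create code above, and treating every other status as "still pending" is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errFailed = errors.New("resource entered a failed status")

// waitUntilReady calls check on every tick until it reports "1" (ready),
// "2" (failed), or the context is cancelled or times out.
func waitUntilReady(ctx context.Context, interval time.Duration, check func(context.Context) (string, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			status, err := check(ctx)
			if err != nil {
				continue // treat query errors as transient and retry
			}
			switch status {
			case "1":
				return nil
			case "2":
				return errFailed
			}
		}
	}
}

// pollDemo runs waitUntilReady against a fake check that becomes ready
// after readyAfter polls, and returns how many polls were made.
func pollDemo(readyAfter int, timeoutMillis int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()

	polls := 0
	err := waitUntilReady(ctx, 5*time.Millisecond, func(context.Context) (string, error) {
		polls++
		if polls >= readyAfter {
			return "1", nil
		}
		return "0", nil // hypothetical "still creating" status
	})
	return polls, err
}

func main() {
	polls, err := pollDemo(3, 1000)
	fmt.Println("polls:", polls, "err:", err)
}
```

Keeping the loop in one helper would also make it straightforward to reuse the same waiting logic in Update.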

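4. Update is asynchronous but does not write data back

Our Update method is also asynchronous, but we did not add polling to write the data back. Instead, we add a Warning telling users to refresh the data themselves (on current Terraform versions, terraform apply -refresh-only is the recommended replacement for terraform refresh):

	resp.Diagnostics.AddWarning(
		"Asynchronous Operation",
		"The MySQL instance update is an asynchronous operation. Use 'terraform refresh' to get the latest state after the update completes.",
	)

That is the implementation of the resource. As you can see, the overall approach is quite straightforward: use Get to read the current data from the request's plan or state, then use Set to write the data fetched from the API back into the state.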

5. Data source structure design

Now for the last kind of resource. As before, let's start with the official interface definition:

type DataSource interface {
	// Metadata should return the full name of the data source, such as
	// examplecloud_thing.
	Metadata(context.Context, MetadataRequest, *MetadataResponse)

	// Schema should return the schema for this data source.
	Schema(context.Context, SchemaRequest, *SchemaResponse)

	// Read is called when the provider must read data source values in
	// order to update state. Config values should be read from the
	// ReadRequest and new state values set on the ReadResponse.
	Read(context.Context, ReadRequest, *ReadResponse)
}

The interface is much like the previous ones, except that only the simple Read method needs to be implemented; the rest of the lifecycle does not exist. Likewise, we need to implement DataSourceWithConfigure in order to obtain the provider configuration from earlier.

type DataSourceWithConfigure interface {
	DataSource

	// Configure enables provider-level data or clients to be set in the
	// provider-defined DataSource type. It is separately executed for each
	// ReadDataSource RPC.
	Configure(context.Context, ConfigureRequest, *ConfigureResponse)
}

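The code is not repeated here. The one thing worth noting is that in the resource design a field such as VipData is declared as types.List, so the CRUD methods need extra code to convert between types.List and the actual structure. For a data source, however, we can use a native Go slice directly. Why? In a resource, the Create flow reads VipData from the plan, and Terraform has to decode it at a point where a computed value may still be unknown. A native Go slice cannot represent an unknown value, so it loses the extra handling that the Terraform types provide and the terraform command fails with an error. A data source does not have this concern.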
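The difference can be modeled without the framework. In this sketch, ListValue is a hypothetical three-state list (null, unknown, known) that plays the role of types.List, and toSlice shows why decoding into a native slice has to fail when the value is unknown: a []string has a natural representation for "no value" (nil) but none for "not known yet". The framework's own types and error messages differ; this only illustrates the idea.

```go
package main

import (
	"errors"
	"fmt"
)

type listState int

const (
	listNull listState = iota
	listUnknown
	listKnown
)

// ListValue is a simplified stand-in for a framework list value.
type ListValue struct {
	State listState
	Elems []string
}

var errUnknown = errors.New("value is unknown; a native Go slice cannot represent it")

// toSlice decodes a ListValue into a native slice.
// Null maps to a nil slice; unknown has no slice equivalent and is an error.
func toSlice(v ListValue) ([]string, error) {
	switch v.State {
	case listUnknown:
		return nil, errUnknown
	case listNull:
		return nil, nil
	default:
		return v.Elems, nil
	}
}

func main() {
	// During planning, a computed attribute such as vip_data is not known yet.
	if _, err := toSlice(ListValue{State: listUnknown}); err != nil {
		fmt.Println("plan decode failed:", err)
	}

	// After the API has answered, the value is known and decodes cleanly.
	vips, _ := toSlice(ListValue{State: listKnown, Elems: []string{"3.3.3.3"}})
	fmt.Println("vips:", vips)
}
```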

6. Entry file

Finally, let's look at the code of the entry file:

var (
	// these will be set by the goreleaser configuration
	// to appropriate values for the compiled binary.
	version string = "dev"

	// goreleaser can pass other information to the main package, such as the specific commit
	// https://goreleaser.com/cookbooks/using-main.version/
)

func main() {
	var debug bool

	flag.BoolVar(&debug, "debug", false, "set to true to run the provider with support for debuggers like delve")
	flag.Parse()

	opts := providerserver.ServeOpts{
		// TODO: Update this string with the published name of your provider.
		// Also update the tfplugindocs generate command to either remove the
		// -provider-name flag or set its value to the updated provider name.
		Address: "local/namespace/zyundb",
		Debug:   debug,
	}

	err := providerserver.Serve(context.Background(), provider.New(version), opts)

	if err != nil {
		log.Fatal(err.Error())
	}
}

The main thing to note here is Address. For a published provider it can be replaced with the corresponding registry hostname and namespace. Since we are working in a local environment, we use local for now.
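The Address string has three slash-separated parts: hostname, namespace, and provider type. The sketch below splits it and derives the conventional binary name prefix, terraform-provider-<type>, which matches the binary built later in this article. parseAddress and providerAddress are hypothetical helpers for illustration and are stricter than Terraform itself, which also accepts shorter forms in some places.

```go
package main

import (
	"fmt"
	"strings"
)

// providerAddress holds the three parts of a fully qualified provider address.
type providerAddress struct {
	Hostname  string
	Namespace string
	Type      string
}

// parseAddress splits "hostname/namespace/type" and rejects anything else.
func parseAddress(addr string) (providerAddress, error) {
	parts := strings.Split(addr, "/")
	if len(parts) != 3 {
		return providerAddress{}, fmt.Errorf("expected hostname/namespace/type, got %q", addr)
	}
	for _, part := range parts {
		if part == "" {
			return providerAddress{}, fmt.Errorf("empty segment in %q", addr)
		}
	}
	return providerAddress{Hostname: parts[0], Namespace: parts[1], Type: parts[2]}, nil
}

// binaryPrefix returns the conventional prefix of the provider binary name.
func (a providerAddress) binaryPrefix() string {
	return "terraform-provider-" + a.Type
}

func main() {
	addr, err := parseAddress("local/namespace/zyundb")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(addr.Hostname, addr.Namespace, addr.Type, addr.binaryPrefix())
}
```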

7. Writing tests

Don't forget to write tests. Here is a fairly simple one as an example (it is an acceptance test, which is why it is run with TF_ACC=1 below); see the implementation of the corresponding testing package for details:

func TestAccMysqlInstanceDataSource(t *testing.T) {
	resource.Test(t, resource.TestCase{
		ProtoV6ProviderFactories: testAccProtoV6ProviderFactories,
		Steps: []resource.TestStep{
			// Read testing
			{
				Config: providerConfig + `data "zyundb_mysql_instance" "all" {
    project_id = "xxxx"
}`,
				Check: resource.ComposeAggregateTestCheckFunc(
					// Verify that the list of instances is returned
					resource.TestCheckResourceAttrSet("data.zyundb_mysql_instance.all", "mysql_instance.#"),
					// Verify the first instance to ensure all attributes are set
					resource.TestCheckResourceAttrSet("data.zyundb_mysql_instance.all", "mysql_instance.0.id"),
					resource.TestCheckResourceAttrSet("data.zyundb_mysql_instance.all", "mysql_instance.0.name"),
					resource.TestCheckResourceAttrSet("data.zyundb_mysql_instance.all", "mysql_instance.0.port"),
				),
			},
		},
	})
}

Then run it:

TF_LOG=ERROR TF_ACC=1 go test -count=1 -run='TestAccMysqlInstanceDataSource' -v

Output:

=== RUN   TestAccMysqlInstanceDataSource
--- PASS: TestAccMysqlInstanceDataSource (2.66s)
PASS
ok    terraform-provider-zyundb/internal/provider  3.574s

3
Using Your Own Provider Locally

Now that we have implemented our own Provider, let's use it. First, let's look at the Terraform workflow:

Overall, you run terraform init to initialize the environment, then terraform plan to see what Terraform is going to change, and finally terraform apply to apply the change.

Because we are using a local Provider, we first need to edit the Terraform CLI configuration file so that Terraform can find it. This file is usually ~/.terraformrc. Open it and modify it:

provider_installation {

  dev_overrides {
    "local/namespace/zyundb" = "/Users/xxx/terraform-providers"
  }

  # For all other providers, install them directly from their origin provider
  # registries as normal. If you omit this, Terraform will _only_ use
  # the dev_overrides block, and so no other providers will be available.
  direct {}
}

Note that local/namespace/zyundb must match what we defined in the entry file, and /Users/xxx/terraform-providers is where we will put the binary. You can choose any directory you like, as long as it contains the final Provider binary.

Build

GOOS=darwin GOARCH=amd64 go build -o terraform-provider-zyundb_v1.0.0

Then put the compiled binary into the directory above.

Configure and test

Create a main.tf with the following content:

# Copyright (c) HashiCorp, Inc.

terraform {
  required_providers {
    zyundb = {
      source  = "local/namespace/zyundb"
      version = "1.0.0"
    }
  }
}

provider "zyundb" {
  endpoint          = "your endpoint"
  access_key_id     = "your access key"
  access_key_secret = "your access key secret"
}

data "zyundb_mysql_instance" "all" {
  project_id = "your resource group ID"
}

output "mysql_instance" {
  value = data.zyundb_mysql_instance.all
}

resource "zyundb_mysql_instance" "example" {
  name          = "example"
  project_id    = "your resource group ID"
  pkg_id        = "package ID"
  instance_type = "NORMAL"
  mode          = "master-slave"
  master_idc    = "xxidc"
  redundant_idc = ""
  network_id    = "xxx"
  subnet_id     = "xxx"
  is_audit_log  = true

  timeouts = {
    create = "60m"
  }
}

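Next, do we run terraform init? No. When using a local Terraform provider through dev_overrides, skip this step. Why?

terraform init initializes a Terraform configuration: it normally downloads the required providers from their remote registries and initializes the state. A locally developed provider, however, has already been specified through dev_overrides, so Terraform loads it from the local directory and does not need terraform init to fetch it.

If you run terraform init anyway, it may try to download the provider from a remote registry, which can cause errors or conflicts with the local provider.

For example:

So go ahead and run terraform plan.

It lists the changes that will be made. If nothing looks wrong, run terraform apply to apply them.

During apply, the log lines from our polling loop are printed as well:

Of course, plan or apply may fail. In that case, prefix the command with TF_LOG=debug or TF_LOG=trace to see detailed error information.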

4
Generating Documentation

If the Provider is going to be published, documentation is essential, and Terraform gives us a convenient way to generate it. Remember the Description fields we added in the code? They are required for generating the documentation.

With the Descriptions in place, create an examples directory in the project and write the example .tf files there. Then create a tools directory containing tools.go, as follows:

//go:build generate

package tools

import (
	_ "github.com/hashicorp/copywrite"
	_ "github.com/hashicorp/terraform-plugin-docs/cmd/tfplugindocs"
)

// Generate copyright headers
//go:generate go run github.com/hashicorp/copywrite headers -d .. --config ../.copywrite.hcl

// Format Terraform code for use in documentation.
// If you do not have Terraform installed, you can remove the formatting command, but it is suggested
// to ensure the documentation is formatted properly.
//go:generate terraform fmt -recursive ../examples/

// Generate documentation.
//go:generate go run github.com/hashicorp/terraform-plugin-docs/cmd/tfplugindocs generate --provider-dir .. -provider-name zyundb

Note that provider-name must match the name of your provider.

Then run cd tools; go generate ./...

The output looks like this:

The documentation is then generated.

Finally, here is the overall directory structure:

.
├── README.md
├── docs // everything under this directory is auto-generated documentation
│   ├── data-sources
│   │   └── mysql_instance.md
│   ├── index.md
│   └── resources
│       └── mysql_instance.md
├── examples // the example .tf files we wrote
│   ├── README.md
│   ├── data-sources
│   │   └── zyundb
│   │       └── data-source.tf
│   ├── main.tf
│   ├── provider
│   │   └── provider.tf
│   ├── resources
│   │   └── zyundb
│   │       ├── import.sh
│   │       └── resource.tf
│   └── terraform.tfstate // state file generated by terraform apply
├── go.mod
├── go.sum
├── internal
│   ├── client // our own code for handling API requests; an SDK could replace it
│   │   ├── client.go
│   │   └── mysql.go
│   └── provider // the Provider implementation
│       ├── mysql_instance_data_source.go
│       ├── mysql_instance_data_source_test.go
│       ├── mysql_instance_resource.go
│       ├── mysql_instance_resource_test.go
│       ├── provider.go
│       └── provider_test.go
├── main.go
└── tools
    ├── go.mod
    ├── go.sum
    └── tools.go