doc: make doc hierarchy more focused by holmanb · Pull Request #6694 · canonical/cloud-init

holmanb · 2026-01-29T00:13:09Z

The current hierarchy is quite "horizontal", which makes it difficult to navigate to common pages when you have to wade through lists of pages which are heavily contextual.

The basic principle that underlies this organizational change is that any configuration that can be provided at runtime is considered to be for general users. Similar content is organized under specific pages to make navigation easier. Any configuration that requires image modification is organized under an "advanced" section. Implementation details are redacted from general pages and in some cases moved to development locations.

Commit message:

doc: make doc hierarchy more focused

Various pages in the table of contents documented things that are not
useful without some other context. Remove these from the table of
contents and link to them from pages that have that context.

Introduce new "advanced" pages under the reference and explanation
categories to link to pages which are not suitable for the average user.

Other pages contained implementation details. Remove the implementation
details or move the page to be under the new "advanced" pages.

Introduce a new "project status" page to gather project-related info.

Delete content and pages containing duplicate information.

Don't make security policy recommendations to users.

Remove page documenting the performance analysis subcommand, since there
are more accurate ways of analyzing performance.

File deletions
--------------

about-cloud-config.rst
performance_analysis.rst

File renames
------------

faq.rst -> from reference to explanation
user_files.rst -> from reference to explanation
module_run_frequency -> from how-to to reference
test_unreleased_packages -> from how-to to reference
format.rst -> format/index.rst (and sub-pages)

Note: Attempts were made to avoid modifying URLs, but some recategorizations were necessary.

holmanb · 2026-02-02T22:26:51Z

The following screenshots show the table of contents introduced by this change.

How-to:

Reference:

Reference / Advanced:

Explanation:

Explanation / Advanced:

holmanb · 2026-02-02T22:27:44Z

 * :file:`status.json`:
  JSON file showing the datasource used, a breakdown of all four stages,
  whether any errors occurred, and the start and stop times of the stages.
+


These are implementation details about internal files that were moved from a user-facing page. Rework sections accordingly.

holmanb · 2026-02-02T22:30:03Z

@@ -1,141 +0,0 @@
-.. _about-cloud-config:


We have a page that introduces cloud-config, a ton of cloud-config examples, and a how-to which describes how to check cloud-config. I may have missed something on this page, but it seems like everything on this page either 1) already exists elsewhere in the docs or 2) isn't important.

holmanb · 2026-02-02T22:35:20Z

@@ -1,5 +1,8 @@
+:orphan:


These commands basically just generate reports based on log timestamps, which isn't accurate. This approach completely misses, for example, time spent importing Python libraries - which occurs before the logger logs anything. Other tools are more accurate and useful, and this command might be better off deprecated. For now, just remove it from the table of contents - it is linked to from elsewhere.
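
To illustrate the gap described above, here is a minimal Python sketch. It is not cloud-init code: the `measure` function, the `startup_sketch` logger name, and the 0.05 second sleep are all made up for illustration. It compares the elapsed wall-clock time with the time span that can be reconstructed from log record timestamps alone.

```python
import importlib
import logging
import time


def measure(module_name: str) -> dict:
    """Compare wall-clock time with the span visible in log timestamps."""
    timestamps = []

    class Collect(logging.Handler):
        # Record the creation time of every log record we receive.
        def emit(self, record):
            timestamps.append(record.created)

    log = logging.getLogger("startup_sketch")  # hypothetical logger name
    log.setLevel(logging.INFO)
    log.propagate = False
    handler = Collect()
    log.addHandler(handler)

    wall_start = time.time()
    importlib.import_module(module_name)  # work done before any log line
    time.sleep(0.05)  # stand-in for other startup cost before logging
    log.info("first log line")
    log.info("last log line")
    wall_end = time.time()
    log.removeHandler(handler)

    return {
        "wall_clock": wall_end - wall_start,
        "from_logs": timestamps[-1] - timestamps[0],
    }
```

Calling `measure("json")` should report a `from_logs` value much smaller than `wall_clock`, because everything that happened before the first log line is invisible to a report built only from log timestamps.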

holmanb · 2026-02-02T22:41:00Z

 Base configuration
 ==================

-The base configuration format uses `YAML version 1.1`_, but may be


Jinja templating in the base config is fragile: keys that are required before getting configuration (it is not documented which ones those are) cannot be templated without unexpected behavior or breakage, since templating requires knowing which keys are available before launch and requires the datasource to be accessed before keys can be read. While possible, this doesn't seem like something that users should be directed towards.

holmanb · 2026-02-02T22:41:50Z

- **Kernel command line**: On the kernel command line, anything found between
-  ``cc:`` and ``end_cc`` will be interpreted as cloud-config user-data.


Not something that we want to direct users towards. The original purpose - datasource_list - has a dedicated key. Undocument for now.

holmanb · 2026-02-02T22:42:40Z

+:orphan:
+


This documents internal behavior. It is linked to from places that make sense, but doesn't need to be in the table of contents.

holmanb · 2026-02-02T22:42:52Z


-Failure states
-==============
+Failure modes


Clearer language

holmanb · 2026-02-02T23:03:39Z

-SSH Host keys
-=============
-
-Cloud-init publishes the SSH host public keys generated to the serial console
-which can be validated prior to any SSH client connection to the launched VM.
-
-It provides assurance that you are connecting to the virtual machine you
-intended to launch, and not being intercepted by a man-in-the-middle (MITM)
-attack.


This is the exception to the above statement: it is specific to cloud-init. However, it isn't part of hardening an OS, so it didn't fit on this page. I can move this to another page, if desired.

holmanb · 2026-02-02T23:04:21Z

-Our explanatory and conceptual guides are written to provide a better
-understanding of how ``cloud-init`` works. They enable you to expand your
-knowledge and become better at using and configuring ``cloud-init``.


Short on substance.

holmanb · 2026-02-02T23:05:50Z

-.. toctree::
-   :maxdepth: 1
-   :hidden:
+:orphan:


This page is useful for anyone trying to use jinja templates, but it is confusing to land on without context. It is linked to from jinja-related content, so drop it from the table of contents.

holmanb · 2026-02-02T23:06:06Z

-   :hidden:
+:orphan:

-   kernel-command-line.rst


This hidden TOC seemed misplaced.

holmanb · 2026-02-02T23:06:49Z

 ``vendordata`` keys will see only redacted values.

 .. note::
-   To save time designing a user data template for a specific cloud's


Unnecessarily verbose

holmanb · 2026-02-02T23:07:17Z

-Storage locations
-----------------
-
-* :file:`/run/cloud-init/instance-data.json`: world-readable JSON containing
-  standardized keys, sensitive keys redacted.
-* :file:`/run/cloud-init/instance-data-sensitive.json`: root-readable
-  unredacted JSON blob.
-* :file:`/run/cloud-init/combined-cloud-config.json`: root-readable
-  unredacted JSON blob. Any meta-data, vendor-data and user-data overrides
-  are applied to the :file:`/run/cloud-init/combined-cloud-config.json` config
-  values.
-


These implementation details are moved to the internal_files.rst page.

holmanb · 2026-02-02T23:07:35Z

 .. note::
   ``merged_system_cfg`` represents only the merged config from the underlying
   filesystem. These values can be overridden by meta-data, vendor-data or
-   user-data. The fully merged cloud-config provided to a machine


Internal detail

holmanb · 2026-02-02T23:07:51Z

 Example Output
 --------------

-Below is an example of ``/run/cloud-init/instance-data-sensitive.json`` on an


Implementation detail.

holmanb · 2026-02-02T23:08:41Z

@@ -1,3 +1,5 @@
+:orphan:


This page isn't useful without more context. Link to it from places that make sense - and allow finding it via search - but don't include it in the table of contents.

holmanb · 2026-02-02T23:09:45Z

@@ -1,3 +1,5 @@
+:orphan:


This is linked from the breaking changes page, but doesn't need to be part of the table of contents. Users will find this page if they are trying to answer questions about return codes.

holmanb · 2026-02-02T23:11:57Z

-
-
-
-.. _configuration_files:
-
-Configuration files
-===================
-
-``Cloud-init`` configuration files are provided in two places:
-
- :file:`/etc/cloud/cloud.cfg`
- :file:`/etc/cloud/cloud.cfg.d/*.cfg`
-
-These files can define the modules that run during instance initialization,
-the datasources to evaluate on boot, as well as other settings.
-
-See the :ref:`configuration sources explanation<configuration>` and
-:ref:`configuration reference<base_config_reference>` pages for more details.


These are documented on the base configuration page.

holmanb · 2026-02-02T23:12:54Z

@@ -1,3 +1,5 @@
+:orphan:


This page makes more sense when linked from a different page than as a top-level item in the table of contents.

holmanb · 2026-02-02T23:14:25Z

-For more information about the file and how to construct it, see
-:ref:`our explanatory guide <about-cloud-config>` about the *user-data
-cloud-config* format.


cloud-config is a format, not a file. In the case here, there is a file that happens to have the cloud-config. Don't misconstrue separate ideas.

holmanb · 2026-02-02T23:16:36Z

-Instance data and lazy networks
-------------------------------
-
-One of the hallmarks of ``cloud-init`` is
-:ref:`its use of instance-data and JINJA queries <instancedata-Using>` -- the
-ability to write queries in user-data and vendor-data that reference runtime
-information present in :file:`/run/cloud-init/instance-data.json`. This works
-well when the meta-data provides all of the information up front, such as the
-network configuration. For systems that rely on DHCP, however, this
-information may not be available when the meta-data is persisted to disk.
-
-This datasource ensures that even if the instance is using DHCP to configure
-networking, the same details about the configured network are available in
-:file:`/run/cloud-init/instance-data.json` as if static networking was used.
-This information collected at runtime is easy to demonstrate by executing the
-datasource on the command line. From the root of this repository, run the
-following command:
-
-.. code-block:: bash
-
-   PYTHONPATH="$(pwd)" python3 cloudinit/sources/DataSourceVMware.py
-
-The above command will result in output similar to the below JSON:


This is a lot of words to say, basically:

"jinja templates sometimes behave one way on vmware, and sometimes a different way"

What makes these two ways different has to do with network bringup and when config is gathered, which is alluded to below.

Good cleanup.

holmanb · 2026-02-02T23:17:03Z

-Sometimes ``cloud-init`` may bring up the network, but it will not finish
-coming online before the datasource's ``setup`` function is called, resulting
-in a :file:`/var/run/cloud-init/instance-data.json` file that does not have the
-correct network information. It is possible to instruct the datasource to wait
-until an IPv4 or IPv6 address is available before writing the instance-data
-with the following meta-data properties:


These are implementation details that the user shouldn't have to care about.

holmanb · 2026-02-03T21:10:37Z

 .B "-l, --long"
 Report extended cloud-id information as tab-delimited string

-.TP


Implementation detail

I think we should continue to include this parameter. This is part of a man page which should represent the options we accept and the defaults that the command assumes when the parameter is not provided. I don't agree that we should redact the parameter defaults. It's just documenting a manual for the script.

which should represent the options we accept and the defaults that the command assumes when the parameter is not provided

Why should the user be exposed to this information? They shouldn't need this flag.

Typical uses shouldn't; running it as a local developer outside of a cloud-init instance only came into play when debugging the initial cloud-id tool. I see your point.

holmanb · 2026-02-03T21:13:24Z

@@ -0,0 +1,46 @@
+.. _user_data_formats-cloud_boothook:


Content in files in this directory came from format.rst. This file had an overwhelming amount of content which made it difficult to find a specific piece of information.

I like the separation of each file here. We'll have to set up a number of redirects in readthedocs to handle specific section redirects here to the now separate pages, but I think it adds to the clarity of each page.

We'll have to set up a number of redirects in readthedocs to handle specific section redirects here to the now separate pages

As long as we have /format -> /format/index, the subsection links will resolve to the format page, which itself links to the pages that contain the former subsections. I wouldn't bother with maintaining subsections as long as the new result is reasonable - which in this case it is.
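
A minimal Python sketch of the behavior described above, assuming a single page-level redirect. The `REDIRECTS` table, the `resolve` helper, and the example fragment name are hypothetical and are not the actual Read the Docs configuration. It shows that a page-level redirect rewrites only the path, so an old subsection link still lands on the new index page.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical redirect table: old page path -> new page path.
REDIRECTS = {"/format": "/format/index"}


def resolve(url: str) -> str:
    """Apply a page-level redirect, leaving any #fragment untouched."""
    parts = urlsplit(url)
    new_path = REDIRECTS.get(parts.path, parts.path)
    return urlunsplit(parts._replace(path=new_path))


# An old subsection link resolves to the new index page.
print(resolve("/format#user-data-script"))
```

If the fragment no longer exists on the target page, a browser typically just shows the top of that page, which is where the links to the split-out pages live.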

holmanb · 2026-02-03T21:14:43Z

@@ -0,0 +1,70 @@
+.. _configuration:


Content on this page is based on the former explanation/configuration.rst file. Content was reworded: higher-priority sources are now at the top rather than at the bottom, and unnecessary content was removed.

holmanb · 2026-02-03T21:15:36Z

@@ -1,405 +0,0 @@
-.. _user_data_formats:


Content from this file is now separated into more focused topics.

holmanb · 2026-02-03T21:20:02Z

@@ -0,0 +1,30 @@
+.. _user_data_formats-part_handler:


This is an advanced topic, so it was removed from the format section and added to the advanced page.

blackboxsw

still in progress. This looks really good to reorg. We'll get the proper redirects in place as this gets closer to landing.

blackboxsw · 2026-02-03T22:00:26Z

 .B "-l, --long"
 Report extended cloud-id information as tab-delimited string

-.TP


I think we should continue to include this parameter. This is part of a man page which should represent the options we accept and the defaults that the command assumes when the parameter is not provided. I don't agree that we should redact the parameter defaults. It's just documenting a manual for the script.

blackboxsw · 2026-02-03T22:04:06Z

@@ -0,0 +1,46 @@
+.. _user_data_formats-cloud_boothook:


I like the separation of each file here. We'll have to set up a number of redirects in readthedocs to handle specific section redirects here to the now separate pages, but I think it adds to the clarity of each page.

blackboxsw · 2026-02-03T22:21:52Z

+   Validate user-data <debug_user_data.rst>
+   Debug cloud-init <debugging.rst>
   Check the status of cloud-init <status.rst>
+   Launch an instance using cloud-init <launching.rst>


What was the motivation for moving Validate user-data and Debug cloud-init to the top, ahead of Launching an instance?

Traffic analysis shows debug_user_data getting more hits than the launching page.

blackboxsw · 2026-02-03T22:22:58Z

-Instance data and lazy networks
-------------------------------
-
-One of the hallmarks of ``cloud-init`` is
-:ref:`its use of instance-data and JINJA queries <instancedata-Using>` -- the
-ability to write queries in user-data and vendor-data that reference runtime
-information present in :file:`/run/cloud-init/instance-data.json`. This works
-well when the meta-data provides all of the information up front, such as the
-network configuration. For systems that rely on DHCP, however, this
-information may not be available when the meta-data is persisted to disk.
-
-This datasource ensures that even if the instance is using DHCP to configure
-networking, the same details about the configured network are available in
-:file:`/run/cloud-init/instance-data.json` as if static networking was used.
-This information collected at runtime is easy to demonstrate by executing the
-datasource on the command line. From the root of this repository, run the
-following command:
-
-.. code-block:: bash
-
-   PYTHONPATH="$(pwd)" python3 cloudinit/sources/DataSourceVMware.py
-
-The above command will result in output similar to the below JSON:


Good cleanup.

blackboxsw · 2026-02-03T22:23:20Z

-Sometimes ``cloud-init`` may bring up the network, but it will not finish
-coming online before the datasource's ``setup`` function is called, resulting
-in a :file:`/var/run/cloud-init/instance-data.json` file that does not have the
-correct network information. It is possible to instruct the datasource to wait
-until an IPv4 or IPv6 address is available before writing the instance-data
-with the following meta-data properties:


holmanb · 2026-02-05T15:04:24Z

Thanks for the feedback @blackboxsw. I've applied the requested changes.

blackboxsw

Thank you @holmanb for the resolution of the minor nits. I found a few more in this pass, but other than these, this overhaul in organization looks really good. I do like that we've retained some routes for moved/consolidated content, which should help by giving breadcrumbs for old links without the need for specific redirects.

blackboxsw · 2026-02-05T19:51:18Z

 .B "-l, --long"
 Report extended cloud-id information as tab-delimited string

-.TP


Typical uses shouldn't; running it as a local developer outside of a cloud-init instance only came into play when debugging the initial cloud-id tool. I see your point.

blackboxsw · 2026-02-05T19:57:17Z

+- :ref:`User-data script <user_data_script>`
+- :ref:`Boothook <user_data_formats-cloud_boothook>`
+
+Formats that deal with other user-data formats:


This wording seems to repeat formats a bit too much. Can we think of another way to say this?

Suggested change

Formats that deal with other user-data formats:

Formats that provide additional user-data content:

I think it would actually be confusing to word it this way because the point of this categorization is that one format is embedded within another. How about:

Formats that embed other formats.

blackboxsw · 2026-02-05T20:10:52Z

-   together. If you would like to explore examples by operation or process
-   instead, refer to the :ref:`examples library <examples_library>`.
+
+   :ref:`This page <examples_library>` organizes examples by category.


Suggested change

:ref:`This page <examples_library>` organizes examples by category.

:ref:`The page <examples_library>` organizes examples by category.

I think it is better as is.

This page organizes examples by category.

vs

The page organizes examples by category.

I misread the rst decorations on that. The rendered content makes sense. Thanks for disregarding this.

Easy mistake - no worries :)

holmanb · 2026-02-05T20:39:27Z

Thanks for the feedback @blackboxsw. I think that I've addressed all of your comments. Please let me know if I missed anything.

blackboxsw

This looks really good. Thank you for the broad cleanup here.

holmanb · 2026-02-05T21:10:17Z

Thanks for the review @blackboxsw!

holmanb · 2026-02-05T21:13:29Z

Thanks for the reviews @blackboxsw!

github-actions Bot added the documentation This Pull Request changes documentation label Jan 29, 2026

holmanb marked this pull request as draft January 29, 2026 00:13

holmanb force-pushed the holmanb/prioritize-frequent-docs branch from e44f9fd to 3b47a96 Compare January 29, 2026 00:31

holmanb assigned blackboxsw Jan 30, 2026

holmanb marked this pull request as ready for review January 30, 2026 17:55

holmanb commented Feb 2, 2026

Comment thread doc/rtd/explanation/hardening.rst

holmanb commented Feb 2, 2026

holmanb added 6 commits February 3, 2026 17:58

redact commands that the user shouldn't need

1b9a718

rewrite configuration explanation page

883f943

add part handlers to advanced section; relocate formats to reference; move cloud-config under the formats page in table of contents

fix / tweak TOC nesting

4a1990b

redact internal cli commands

72b28a6

wording, broken link

b756d95

split format page into multiple pages

6c165e0

rewrite content in the configuration priority page to focus on higher priority first (more useful)

holmanb commented Feb 3, 2026

move part handlers to explanation

902584e

holmanb commented Feb 3, 2026

holmanb added 2 commits February 3, 2026 21:30

simplify log file docs

598155f

fix build

f400e02

blackboxsw reviewed Feb 3, 2026

holmanb added 2 commits February 3, 2026 23:11

comments

fcf49b8

comments

1085174

holmanb requested a review from blackboxsw February 5, 2026 15:03

blackboxsw reviewed Feb 5, 2026

commentsp

3a66b8e

blackboxsw approved these changes Feb 5, 2026

holmanb merged commit 961fc84 into canonical:main Feb 5, 2026
22 checks passed

holmanb deleted the holmanb/prioritize-frequent-docs branch February 5, 2026 21:14

minaelee mentioned this pull request Feb 6, 2026

Pro refinements (stable-5.0) canonical/lxd#17575

Merged

victorlin mentioned this pull request Feb 6, 2026

[docs]: missing redirect #6730

Closed
