doc: make doc hierarchy more focused #6694
Various pages in the table of contents documented things that are not useful without some other context. Remove these from the table of contents and link to them from pages that have that context. Introduce new "advanced" pages under the reference and explanation categories to link to pages which are not suitable for the average user. Other pages contained implementation details. Remove the implementation details or move the page to be under the new "advanced" pages. Introduce a new "project status" page to gather project-related info. Delete content and pages containing duplicate information. Don't make security policy recommendations to users. Remove page documenting the performance analysis subcommand, since there are more accurate ways of analyzing performance.
| * :file:`status.json`:
|   JSON file showing the datasource used, a breakdown of all four stages,
|   whether any errors occurred, and the start and stop times of the stages.
These are implementation details about internal files that were moved from a user-facing page. Rework sections accordingly.
| @@ -1,141 +0,0 @@
| .. _about-cloud-config:
We have a page that introduces cloud-config, a ton of cloud-config examples, and a how-to which describes how to check cloud-config. I may have missed something on this page, but it seems like everything on this page either 1) already exists elsewhere in the docs or 2) isn't important.
| @@ -1,5 +1,8 @@
| :orphan:
These commands basically just generate reports based on log timestamps, which isn't accurate. They completely miss, for example, time spent importing Python libraries, which occurs before the logger logs anything. Other tools are more accurate and useful, and this command might be better off deprecated. For now, just remove it from the table of contents; it is linked to from elsewhere.
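To illustrate the point about log timestamps, here is a minimal sketch. The log lines, their format, and the process start time are all made up for illustration and are not cloud-init's real log format or the analysis subcommand's actual implementation. It only shows why a report built from log timestamps cannot see work done before the first log line.

```python
from datetime import datetime

# Hypothetical log lines; the format is illustrative only.
LOG_LINES = [
    "2024-01-01 00:00:02,000 stage-a start",
    "2024-01-01 00:00:05,000 stage-a end",
]

# Assumption: the process really started two seconds before the first log line
# (for example, while Python libraries were still being imported).
PROCESS_START = datetime(2024, 1, 1, 0, 0, 0)


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    stamp = " ".join(line.split(" ")[:2])
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


def logged_duration(lines):
    """Duration visible to a log-based report: last timestamp minus first."""
    stamps = [parse_timestamp(line) for line in lines]
    return (max(stamps) - min(stamps)).total_seconds()


def unlogged_startup(lines, process_start):
    """Time spent before the first log line, invisible to a log-based report."""
    first = min(parse_timestamp(line) for line in lines)
    return (first - process_start).total_seconds()


print(logged_duration(LOG_LINES))
print(unlogged_startup(LOG_LINES, PROCESS_START))
```

In this toy case the log-based figure covers only the span between the two log lines, while the time before the first line never shows up in the report.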
| Base configuration
| ==================
|
| The base configuration format uses `YAML version 1.1`_, but may be
Jinja templating in the base config is fragile: keys that are required before getting configuration (it is not documented which ones those are) cannot be templated without unexpected behavior or breakage. It requires knowing which keys are available before launch, and it requires the datasource to be accessed before keys can be read. While possible, this doesn't seem like a thing that users should be directed towards.
| - **Kernel command line**: On the kernel command line, anything found between
|   ``cc:`` and ``end_cc`` will be interpreted as cloud-config user-data.
Not something that we want to direct users towards. The original purpose - datasource_list - has a dedicated key. Undocument for now.
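For readers unfamiliar with the behavior being undocumented here, a rough sketch of the idea follows. This is not cloud-init's actual parser; the function name `extract_cc` and the sample command line are hypothetical, and the sketch only shows what "anything found between ``cc:`` and ``end_cc``" means.

```python
def extract_cc(cmdline):
    """Return the text between 'cc:' and 'end_cc', or None if either is absent.

    Illustrative only: real parsing may handle escapes and repeated
    markers differently.
    """
    start_marker = "cc:"
    end_marker = "end_cc"

    start = cmdline.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)

    end = cmdline.find(end_marker, start)
    if end == -1:
        return None

    return cmdline[start:end].strip()


# Hypothetical kernel command line used only for this example.
sample = "root=/dev/sda1 cc: datasource_list: [ NoCloud ] end_cc quiet"
print(extract_cc(sample))
```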
| :orphan:
This documents internal behavior. It is linked to from places that make sense, but doesn't need to be in the table of contents.
| Failure states
| ==============
| Failure modes
| SSH Host keys
| =============
|
| Cloud-init publishes the SSH host public keys generated to the serial console
| which can be validated prior to any SSH client connection to the launched VM.
|
| It provides assurance that you are connecting to the virtual machine you
| intended to launch, and not being intercepted by a man-in-the-middle (MITM)
| attack.
This is the exception to the above statement: it is specific to cloud-init. However, it isn't part of hardening an OS, so it didn't fit on this page. I can move this to another page, if desired.
| Our explanatory and conceptual guides are written to provide a better
| understanding of how ``cloud-init`` works. They enable you to expand your
| knowledge and become better at using and configuring ``cloud-init``.
| .. toctree::
| :maxdepth: 1
| :hidden:
| :orphan:
This page is useful for anyone trying to use jinja templates, but it is confusing to land on without context. It is linked to from jinja-related content, so drop it from the table of contents.
| :hidden:
| :orphan:
|
| kernel-command-line.rst
This hidden TOC seemed misplaced.
| ``vendordata`` keys will see only redacted values.
|
| .. note::
|    To save time designing a user data template for a specific cloud's
| Storage locations
| -----------------
|
| * :file:`/run/cloud-init/instance-data.json`: world-readable JSON containing
|   standardized keys, sensitive keys redacted.
| * :file:`/run/cloud-init/instance-data-sensitive.json`: root-readable
|   unredacted JSON blob.
| * :file:`/run/cloud-init/combined-cloud-config.json`: root-readable
|   unredacted JSON blob. Any meta-data, vendor-data and user-data overrides
|   are applied to the :file:`/run/cloud-init/combined-cloud-config.json` config
|   values.
These implementation details are moved to the internal_files.rst page.
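As a rough illustration of the redacted/unredacted split described in the moved text, here is a small sketch. The placeholder string, the path syntax, and the sample keys are assumptions made for this example; they are not taken from cloud-init's implementation.

```python
import copy

# Hypothetical placeholder; the real redaction text may differ.
REDACTED = "<redacted>"


def redact(data, sensitive_paths):
    """Return a copy of data with each 'a/b/c' path replaced by a placeholder.

    Paths that do not exist in the data are silently ignored.
    """
    result = copy.deepcopy(data)
    for path in sensitive_paths:
        *parents, leaf = path.split("/")
        node = result
        for key in parents:
            node = node.get(key)
            if not isinstance(node, dict):
                break  # path does not exist; nothing to redact
        else:
            if leaf in node:
                node[leaf] = REDACTED
    return result


# Made-up instance data, standing in for the root-readable blob.
sensitive_blob = {
    "userdata": "secret-bootstrap-script",
    "ds": {"token": "abc123"},
    "v1": {"region": "example-region"},
}

world_readable = redact(sensitive_blob, ["userdata", "ds/token", "missing/key"])
print(world_readable)
```

The original dictionary is left untouched, which mirrors keeping one unredacted file for root and one redacted file for everyone else.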
| .. note::
|    ``merged_system_cfg`` represents only the merged config from the underlying
|    filesystem. These values can be overridden by meta-data, vendor-data or
|    user-data. The fully merged cloud-config provided to a machine
| Example Output
| --------------
|
| Below is an example of ``/run/cloud-init/instance-data-sensitive.json`` on an
| @@ -1,3 +1,5 @@
| :orphan:
This page isn't useful without more context. Link to it from places that make sense - and allow finding it via search - but don't include it in the table of contents.
| @@ -1,3 +1,5 @@
| :orphan:
This is linked from the breaking changes page, but doesn't need to be part of the table of contents. Users will find this page if they are trying to answer questions about return codes.
| .. _configuration_files:
|
| Configuration files
| ===================
|
| ``Cloud-init`` configuration files are provided in two places:
|
| - :file:`/etc/cloud/cloud.cfg`
| - :file:`/etc/cloud/cloud.cfg.d/*.cfg`
|
| These files can define the modules that run during instance initialization,
| the datasources to evaluate on boot, as well as other settings.
|
| See the :ref:`configuration sources explanation<configuration>` and
| :ref:`configuration reference<base_config_reference>` pages for more details.
These are documented on the base configuration page.
| @@ -1,3 +1,5 @@
| :orphan:
This page makes more sense when linked from a different page than as a top-level item in the table of contents.
| For more information about the file and how to construct it, see
| :ref:`our explanatory guide <about-cloud-config>` about the *user-data
| cloud-config* format.
cloud-config is a format, not a file. In the case here, there is a file that happens to have the cloud-config. Don't misconstrue separate ideas.
| Instance data and lazy networks
| -------------------------------
|
| One of the hallmarks of ``cloud-init`` is
| :ref:`its use of instance-data and JINJA queries <instancedata-Using>` -- the
| ability to write queries in user-data and vendor-data that reference runtime
| information present in :file:`/run/cloud-init/instance-data.json`. This works
| well when the meta-data provides all of the information up front, such as the
| network configuration. For systems that rely on DHCP, however, this
| information may not be available when the meta-data is persisted to disk.
|
| This datasource ensures that even if the instance is using DHCP to configure
| networking, the same details about the configured network are available in
| :file:`/run/cloud-init/instance-data.json` as if static networking was used.
| This information collected at runtime is easy to demonstrate by executing the
| datasource on the command line. From the root of this repository, run the
| following command:
|
| .. code-block:: bash
|
|    PYTHONPATH="$(pwd)" python3 cloudinit/sources/DataSourceVMware.py
|
| The above command will result in output similar to the below JSON:
This is a lot of words to say, basically:
"jinja templates sometimes behave one way on vmware, and sometimes a different way"
What makes these two ways different has to do with network bringup and when config is gathered, which is alluded to below.
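A toy sketch of that difference, under stated assumptions: the key names and both sample dictionaries below are invented for illustration and do not reflect the VMware datasource's real instance-data layout. It only shows how the same query can resolve in one case and come back empty in the other, depending on whether network details had been gathered when the data was written.

```python
def lookup(data, dotted_path, default=None):
    """Resolve a dotted path such as 'ds.meta_data.network.ipv4' in nested dicts."""
    node = data
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Hypothetical instance data when the network details are known up front.
STATIC_NETWORK = {"ds": {"meta_data": {"network": {"ipv4": "10.0.0.5"}}}}

# Hypothetical instance data written before DHCP finished.
DHCP_TOO_EARLY = {"ds": {"meta_data": {}}}

QUERY = "ds.meta_data.network.ipv4"
print(lookup(STATIC_NETWORK, QUERY))
print(lookup(DHCP_TOO_EARLY, QUERY, default="<unavailable>"))
```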
| Sometimes ``cloud-init`` may bring up the network, but it will not finish
| coming online before the datasource's ``setup`` function is called, resulting
| in a :file:`/var/run/cloud-init/instance-data.json` file that does not have the
| correct network information. It is possible to instruct the datasource to wait
| until an IPv4 or IPv6 address is available before writing the instance-data
| with the following meta-data properties:
These are implementation details that the user shouldn't have to care about.
add part handlers to advanced section
relocate formats to reference
move cloud-config under the formats page in table of contents
rewrite content in the configuration priority page to focus on higher priority first (more useful)
| .B "-l, --long"
| Report extended cloud-id information as tab-delimited string
|
| .TP
I think we should continue to include this parameter. This is part of a man page which should represent the options we accept and the defaults that the command assumes when the parameter is not provided. I don't agree that we should redact the parameter defaults. It's just documenting a manual for the script.
> which should represent the options we accept and the defaults that the command assumes when the parameter is not provided

Why should the user be exposed to this information? They shouldn't need this flag.
Typical uses shouldn't, as a local developer outside of a cloud-init instance only came into play when debugging the initial cloud-id tool. I see your point.
| @@ -0,0 +1,46 @@
| .. _user_data_formats-cloud_boothook:
Content in files in this directory came from format.rst. This file had an overwhelming amount of content which made it difficult to find a specific piece of information.
I like the separation of each file here. We'll have to set up a number of redirects in readthedocs to handle specific section redirects here to the now separate pages, but I think it adds to the clarity of each page.
> We'll have to set up a number of redirects in readthedocs to handle specific section redirects here to the now separate pages

As long as we have /format -> /format/index, the subsection links will resolve to the format page, which itself links to the pages that contain the former subsections. I wouldn't bother with maintaining subsections as long as the new result is reasonable, which in this case it is.
| @@ -0,0 +1,70 @@
| .. _configuration:
Content on this page is based on the former explanation/configuration.rst file. Content was reworded to put higher priority at the top, rather than at the bottom, and unnecessary content was removed.
| @@ -1,405 +0,0 @@
| .. _user_data_formats:
Content from this file is now separated into more focused topics.
| @@ -0,0 +1,30 @@
| .. _user_data_formats-part_handler:
This is an advanced topic, so it was removed from the format section and added to the advanced page.
blackboxsw left a comment
still in progress. This looks really good to reorg. We'll get the proper redirects in place as this gets closer to landing.
| Validate user-data <debug_user_data.rst>
| Debug cloud-init <debugging.rst>
| Check the status of cloud-init <status.rst>
| Launch an instance using cloud-init <launching.rst>
What was the motivation of reordering Validate user-data and debug cloud-init as top item instead of Launching an instance?
Traffic analysis shows debug_user_data getting more hits than the launching page.
Thanks for the feedback @blackboxsw. I've applied the requested changes.
blackboxsw left a comment
Thank you @holmanb for the resolution of the minor nits. I found a few more in this pass, but other than these, this overhaul in organization looks really good. I do like that we've retained some routes for moved/consolidated content, which should help give breadcrumbs to old links without the need for specific redirects.
| - :ref:`User-data script <user_data_script>`
| - :ref:`Boothook <user_data_formats-cloud_boothook>`
|
| Formats that deal with other user-data formats:
This wording seems to repeat formats a bit too much. Can we think of another way to say this?
| - Formats that deal with other user-data formats:
| + Formats that provide additional user-data content:
I think it would actually be confusing to word it this way because the point of this categorization is that one format is embedded within another. How about:
Formats that embed other formats.
| together. If you would like to explore examples by operation or process
| instead, refer to the :ref:`examples library <examples_library>`.
|
| :ref:`This page <examples_library>` organizes examples by category.
| - :ref:`This page <examples_library>` organizes examples by category.
| + :ref:`The page <examples_library>` organizes examples by category.
I misread the rst decorations on that. The rendered content makes sense. Thanks for disregarding this.
Easy mistake - no worries :)
Thanks for the feedback @blackboxsw. I think that I've addressed all of your comments. Please let me know if I missed anything.
blackboxsw left a comment
This looks really good. Thank you for the broad cleanup here.
Thanks for the review @blackboxsw!
Thanks for the reviews @blackboxsw!
Various pages in the table of contents documented things that are not useful without some other context. Remove these from the table of contents and link to them from pages that have that context. Introduce new "advanced" pages under the reference and explanation categories to link to pages which are not suitable for the average user. Other pages contained implementation details. Remove the implementation details or move the page to be under the new "advanced" pages. Introduce a new "project status" page to gather project-related info. Delete content and pages containing duplicate information. Don't make security policy recommendations to users. Remove page documenting the performance analysis subcommand, since there are more accurate ways of analyzing performance.

File deletions
--------------
about-cloud-config.rst
performance_analysis.rst

File renames
------------
faq.rst -> from reference to explanation
user_files.rst -> from reference to explanation
module_run_frequency -> from how-to to reference
test_unreleased_packages -> from how-to to reference
format.rst -> format/index.rst (and sub-pages)





The current hierarchy is quite "horizontal", which makes it difficult to navigate to common pages when you have to go through lists of pages which are heavily contextual.
The basic principle that underscores this organizational change is that any configuration that can be provided at runtime is considered to be for general users. Similar content is organized under specific pages to make navigation easier. Any configuration that requires image modification is organized under an "advanced" section. Implementation details are redacted from general pages and in some cases moved to development locations.
Commit message:
Note: Attempts were made to avoid modifying URLs, but some recategorizations were necessary.