ARC-23: Sharing Application Information Source

Append application information to compiled TEAL applications

AuthorStéphane Barroso, Fabrice Benhamouda
Discussions-Tohttps://github.com/algorandfoundation/ARCs/issues/80
StatusFinal
TypeStandards Track
CategoryInterface
Created2023-01-11

Abstract

The following document introduces a convention for appending information (stored in various files) to the compiled application’s bytes. The goal of this convention is to standardize the process of verifying and adding this information. The encoded information byte string is arc23 followed by the IPFS CID v1 of a folder containing the files with the information.

The minimum required file is contract.json representing the contract metadata (as described in ARC-4), and as extended by future potential ARCs).

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC-2119.

Comments like this are non-normative.

Files containing Application Information

Application information are represented by various files in a folder that:

  • MUST contain a file contract.json representing the contract metadata (as described in ARC-4), and as extended by future potential ARCs).
  • MAY contain a file with the basename application followed by the extension of the high-level language the application is written in (e.g., application.py for PyTeal).

    To allow the verification of your contract, be sure to write the version used to compile the file after the import eg: from pyteal import * #pyteal==0.20.1

  • MAY contain the files approval.teal and clear.teal, that are the compiled versions of approval and clear program in TEAL.
    • Note that approval.teal will not be able to contain the application information as this would create circularity. If approval.teal is provided, it is assumed that the actual approval.teal that is deployed corresponds to approval.teal with the proper bytecblock (defined below) appended at the end.
  • MAY contain other files as defined by other ARCs.

CID, Pinning, and CAR of the Application Information

The CID allows to access the corresponding application information files using IPFS.

The CID MUST:

  • Represent a folder of files, even if only contract.json is present.

    You may need to use the option --wrap-with-directory of ipfs add

  • Be a version V1 CID

    E.g., use the option --cid-version=1 of ipfs add

  • Use SHA-256 hash algorithm

    E.g., use the option --hash=sha2-256 of ipfs add

Since the exact CID depends on the options provided when creating it and of the IPFS software version (if default options are used), for any production application, the folder of files SHOULD be published and pinned on IPFS.

All examples in this ARC assume the use of Kubo IPFS version 0.17.0 with default options apart those explicitly stated.

If the IPFS is not pinned, any production application SHOULD provide a Content Address Archiver (CAR)( file of the folder, obtained using ipfs dag export.

For public networks (e.g., MainNet, TestNet, BetaNet), block explorers and wallets (that support this ARC) SHOULD try to recover application information files from IPFS, and if not possible, SHOULD allow developers to upload a CAR file. If a CAR file is used, these tools MUST validate the CAR file matches the CID.

For development purposes, on private networks, the application information files MAY be instead provided as a .zip or .tar.gz containing at the root all the required files. Block explorers and wallets for private networks MAY allow uploading the application information as a .zip or .tar.gz. They still SHOULD validate the files.

The validation of .zip or .tar.gz files will work if the same version of the IPFS software is used with the same option. Since for development purposes, the same machine is normally used to code the dApp and run the block explorer/wallet, this is most likely not an issue. However, for production purposes, we cannot assume the same IPFS software is used and a CAR file is the best solution to ensure that the application information files will always be available and possible to validate.

Example: For the example stored in /asset/arc-0023/application_information, the CID is bafybeiavazvdva6uyxqudfsh57jbithx7r7juzvxhrylnhg22aeqau6wte, which can be obtained with the command:

ipfs add --cid-version=1 --hash=sha2-256 --recursive --quiet --wrap-with-directory --only-hash application_information

Associated Encoded Information Byte String

The (encoded) information byte string is arc23 concatenated to the 36 bytes of the binary CID.

The information byte string is always 41-byte long and always start, in hexadecimal with 0x6172633233 (corresponding to arc23).

Example: for the above CID bafybeiavazvdva6uyxqudfsh57jbithx7r7juzvxhrylnhg22aeqau6wte, the binary CID corresponds to the following hexadecimal value:

0x0170122015066a3a83d4c5e1419647efd2144cf7fc7e9a66b73c70b69cdad0090053d699

and hence the encoded information byte string has the following hexadecimal value:

0x61726332330170122015066a3a83d4c5e1419647efd2144cf7fc7e9a66b73c70b69cdad0090053d699

Inclusion of the Encoded Information Byte String in Programs

The encoded information byte string is included in the approval program of the application via a bytecblock with a unique byte string equal to the encoding information byte string.

For the example above, the bytecblock is:

bytecblock 0x61726332330170122015066a3a83d4c5e1419647efd2144cf7fc7e9a66b73c70b69cdad0090053d699

and when compiled this gives the following byte string (at least with TEAL v8 and before):

0x26012961726332330170122015066a3a83d4c5e1419647efd2144cf7fc7e9a66b73c70b69cdad0090053d699

The size of the compiled application plus the bytecblock MUST be, at most, the maximum size of a compiled application according to the latest consensus parameters supported by the compiler.

At least with TEAL v8 and before, appending the bytecblock to the end of the program should add exactly 44 bytes (1 byte for opcode bytecblock, 1 byte for 0x01 -the number of byte strings-, 1 byte for 0x29 the length of the encoded information byte string, 41 byte for the encodedin information byte string)

The bytecblock MAY be placed anywhere in the TEAL source code as long as it does not modify the semantic of the TEAL source code. However, if approval.teal is provided as an application information file, the bytecblock SHOULD be the last opcode of the deployed TEAL program.

Developers MUST check that, when adding the bytecblock to their program, semantic is not changed.

At least with TEAL v8 and before, adding a bytecblock opcode at the end of the approval program does not change the semantics of the program, as long as opcodes are correctly aligned, there is no jump after the last position (that would make the program fail without bytecblock), and there is enough space left to add the opcode, at least with TEAL v8 and before. However, though very unlikely, future versions of TEAL may not satisfy this property.

The bytecblock MUST NOT contain any additional byte string beyond the encoded information byte string.

For example, the following bytecblock is INVALID:

bytecblock 0x61726332330170122015066a3a83d4c5e1419647efd2144cf7fc7e9a66b73c70b69cdad0090053d699 0x42

Retrieval the Encoded Information Byte String and CID from Compiled TEAL Programs

For programs until TEAL v8, a way to find the encoded information byte string is to search for the prefix:

0x2601296172633233

which is then followed by the 36 bytes of the binary CID.

Indeed, this prefix is composed of:

  • 0x26, the bytecblock opcode
  • 0x01, the number of byte strings provided in the bytecblock
  • 0x29, the length of the encoded information byte string
  • 0x6172633233, the hexadecimal of arc23

Software retrieving the encoded information byte string SHOULD check the TEAL version and only perform retrieval for supported TEAL version. They also SHOULD gracefully handle false positives, that is when the above prefix is found multiple times. One solution is to allow multiple possible CID for a given compiled program.

Note that opcode encoding may change with the TEAL version (though this did not happen up to TEAL v8 at least). If the bytecblock opcode encoding changes, software that extract the encoded information byte string from compiled TEAL programs MUST be updated.

Rationale

By appending the IPFS CID of the folder containing information about the Application, any user with access to the blockchain could easily verify the Application and the ABI of the Application and interact with it.

Using IPFS has several advantages:

  • Allows automatic retrievel of the application information when pinned.
  • Allows easy archival using CAR.
  • Allows support of multiple files.

Reference Implementation

The following codes are not audited and are only here for information purposes.

Here is an example of a python script that can generate the hash and append it to the compiled application, according this ARC: main.py.

A Folder containing:

Files are accessible through followings IPFS command:

$ ipfs cat bafybeiavazvdva6uyxqudfsh57jbithx7r7juzvxhrylnhg22aeqau6wte/contract.json
$ ipfs cat bafybeiavazvdva6uyxqudfsh57jbithx7r7juzvxhrylnhg22aeqau6wte/application.py
If they are not accessible be sure to removed [–only-hash -n] from your command or check you ipfs node.

Security Considerations

CIDs are unique; however, related files MUST be checked to ensure that the application conforms. An arc-23 CID added at the end of an application is here to share information, not proof of anything. In particular, nothing ensures that a provided approval.teal matches the actual program on chain.

Copyright and related rights waived via CCO.

Citation

Please cite this document as:

Stéphane Barroso, Fabrice Benhamouda, "ARC-23: Sharing Application Information," Algorand Requests for Comments, no. 23, January 2023. [Online serial]. Available: https://github.com/algorandfoundation/ARCs/blob/main/ARCs/arc-0023.md.