Internet Engineering Task Force                         R. Riccardo, Ed.
Internet-Draft                                       University of Udine
Intended status: Informational                             July 12, 2013
Expires: January 13, 2014


                   A Format for Embedding Code in RFC
                    draft-bernardini-embedded-code-00

Abstract

   Some RFCs need to include code or other kinds of data, such as test
   vectors.  Extracting the code or the data from the RFC "by hand" is
   possible, but tiresome and error-prone, especially if the portion to
   be extracted is long or comprises many files.  This document
   describes a format that can be used to embed code or data in an RFC
   so that it can be extracted by software, while preserving the
   readability of the embedded text.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 13, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Riccardo                Expires January 13, 2014                [Page 1]

Internet-Draft               Embedding Code                    July 2013

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Informal description
   3.  Requirements
   4.  Format description
   5.  Extracting and embedding
     5.1.  Extractor behaviour
     5.2.  Embedder behaviour
   6.  Acknowledgements
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Software
     A.1.  Bare-bone extractor
     A.2.  Full implementation
       A.2.1.  Usage
         A.2.1.1.  Embedder
         A.2.1.2.  Extractor
       A.2.2.  Sources
   Author's Address

1.  Introduction

   Some RFCs need to include code and/or data such as test vectors.
   Depending on the specific RFC, the embedded code can be quite long
   and spread over a fair number of files.  While it is possible to
   extract the embedded text "by hand", the procedure can be tiresome
   and error-prone.  Although some RFCs have adopted ad hoc solutions
   to this problem, this document tries to give a general solution by
   proposing a format that allows any line-oriented text file to be
   embedded in an RFC in a way that permits automatic extraction while
   preserving the readability of the embedded text.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Informal description

   It is worth beginning with an informal example of the proposed
   format:

   !--((BGN))--Ggroup-name F"example.txt"
   !. This is an example of embedded file
   !,1This is a very, very, very, very, very, very, very, very, very,
   !. very, very, very long line
   !.
   !. The line above is empty
   !.3This line ends with 3 spaces
   This line is not part of the content (and it is ignored)
   !, %09This line begins with a tab and contains %05%07 non-printable
   !.  characters
   !--((END))--

   It can be seen that

   o  The embedded code is delimited by the two lines beginning with
      "!--((BGN))--" (header) and "!--((END))--" (trailer).

   o  The header includes the name of the file (after the letter "F")
      and an optional group name marked by the letter "G".

   o  Every line of the embedded content begins with a _line header_,
      either "!," or "!.".  If the line begins with "!,", it is part
      of a content line longer than 64 characters that has been broken
      over a "block" of two or more embedded lines.  The last line of
      the block begins with "!.".
   o  "!," or "!." can be followed by a single number that gives the
      number of spaces that must be added at the end of the embedded
      line (see, for example, the line beginning with "!.3").  If the
      number is absent, no spaces are present at the end of the line.

   o  Printable ASCII characters (32..126) are inserted "as they are".
      Characters outside this interval are represented by an escape
      sequence "%xx".

3.  Requirements

   This section briefly describes and justifies the requirements that
   were used to develop the proposed format.

   REQ-1:  The text to be embedded is most frequently source code.  The
   text is organized as a sequence of lines, and most of the lines
   contain only printable ASCII characters (i.e., octets in the range
   32..126).  No constraint is imposed on the line length.

   REQ-2:  Lines are separated according to the local convention (i.e.,
   with CR-LF on Windows and with LF on Unix).  When a file is
   extracted, the line terminator used is determined by the local
   convention of the machine used for the extraction, independently of
   the convention used on the machine that produced the embedded text.

   Rationale:  According to the model described by REQ-1 and REQ-2, the
   embedded file is not just a stream of octets, but a sequence of
   lines, each line being a sequence of octets.  We feel that this
   "line-wise" approach is more natural when working with files that
   represent source code.

   REQ-3:  A mechanism to include generic octets must be provided, but
   it is not necessary that the mechanism be efficient.

   Rationale:  According to REQ-1, most of the text will consist of
   printable characters; however, we cannot be sure that a
   non-printable character will never appear in source code (e.g., it
   could be an accented letter in a programmer's name in a comment), so
   an escaping mechanism must be provided.
   However, since we do not expect many non-printable characters, even
   an inefficient escaping mechanism (such as the "%xx" used in URLs,
   which triples the number of characters) can be used.

   REQ-4:  Every embedded file must also store its filename, optionally
   complete with a relative path.  Absolute paths must not be allowed.

   Rationale:  Even slightly complex software usually includes many
   files, typically organized in a tree.  In order to re-create the
   original file organization, it is necessary to preserve the name of
   every file and to include relative paths in the name.  In this
   context absolute paths do not make much sense; moreover, since they
   could depend on the specific operating system, in our opinion it is
   better to avoid them.

   The alternative to storing the path was to allow only "simple names"
   and store an organized collection of files as an archive.  However,
   most current archive formats (e.g., tar, zip, rar) are binary-based
   and not suitable for embedding.  There are a few ASCII-based formats,
   but they are not common and their use would introduce a dependence
   on the archival software.  We considered it simpler to allow full
   path names in the embedded format.

   REQ-5:  The pathname will be stored in an OS-independent manner.  It
   will be the duty of the extractor to convert the stored pathname to
   the local convention.  A hierarchical file system structure can be
   assumed.

   Rationale:  The necessity of getting rid of the differences in
   filename conventions between OSes is clear.  The hypothesis that the
   destination filesystem is hierarchically organized does not seem too
   restrictive (do non-hierarchical filesystems exist for systems that
   can be used to extract files from an RFC?) and it simplifies path
   handling.
   Moreover, if the target system were not hierarchically organized,
   how would we organize the fileset, say, "foo/src/bar.c",
   "foo/include/bar.h", "foo/man1/bar.1"?

   REQ-6:  The embedding format must not depend on the amount of spaces
   on _both_ sides of the lines.

   Rationale:  This requirement was, actually, an afterthought.  At the
   beginning I supposed that every embedded line would have its '!' in
   column 1 and that every character in the line was significant,
   ending spaces included.  At the first test I discovered that the
   formatter nevertheless adds three spaces at the beginning of every
   line, so the constraint of beginning in column 1 had to be removed.
   Since a constraint that required _exactly_ three spaces looked too
   frail to me, I decided that any amount of space could precede the
   initial '!'.

   The requirement that spaces at the end of the line must also be
   ignored was introduced when I discovered that, even when the
   embedded code was put inside a CDATA block, the spaces at the end of
   each line were removed.  I did not investigate whether this was due
   to the XML specs, a characteristic of xml2rfc, or something in my
   editor (jedit); I just decided that it would be more robust to
   ignore the spaces at the end and put a "space counter" just after
   the line header.

   REQ-7:  The embedding format must allow for non-embedded lines in
   the embedding section.

   Rationale:  For example, if an embedding section spans more than one
   page, page headers and footers will be surrounded by embedded lines.

   REQ-8:  A file extracted from an RFC must be a line-by-line verbatim
   copy of the original file, in the sense that the two files will have
   the same number of lines and that the sequence of octets in the n-th
   line of the original file will be equal to the sequence of octets in
   the n-th line of the extracted file (for every n, of course).

   Rationale:  The necessity of having a lossless recovery is pretty
   obvious.
   The fact that "lossless recovery" is tested line by line, rather
   than character by character, is in line with the idea of considering
   the source code line-organized and makes lossless recovery possible
   even across architectures.

   REQ-9:  The length of any embedded line (including header and spaces
   at the beginning) cannot be larger than 72.

   Rationale:  This constraint is inherited from the RFC format.

4.  Format description

   In this section we formally describe the proposed format.

   o  Every file to be embedded is converted into an _embedded text_ to
      be included in the target RFC (e.g., via a CDATA block).  The
      embedded text is split into _embedded lines_.

   o  Spaces at the beginning and at the end of an embedded line are
      not significant, so the extractor can trim each line at both
      sides, if this makes its job simpler.

   o  The embedded text begins with the _header line_ "!--((BGN))--"
      and ends with the _trailer line_ "!--((END))--".

   o  Header and trailer formats are described by the following grammar

   header-line   = header-marker [group] filename
   header-marker = %x21.2D.2D.28.28.42.47.4E.29.29.2D.2D ; !--((BGN))--
   trailer-line  = %x21.2D.2D.28.28.45.4E.44.29.29.2D.2D ; !--((END))--
   group         = %x47 1*VCHAR SP                       ; G
   filename      = %x46 delim1 path delim2               ; F
   delim1        = NO-SLASH
   delim2        = NO-SLASH
   path          = [directory] simple-name
   directory     = 1*(name "/")
   simple-name   = name
   name          = 1*(SP / NO-SLASH)
   VCHAR         = %x21-7E
   NO-SLASH      = %x21-2E / %x30-7E  ; Visible char without '/'

      where delim1 MUST be equal to delim2 and MUST NOT be contained in
      path (it is not possible to specify this constraint with ABNF).
      delim1 and delim2 can be any printable character except "/",
      since it is used as path separator.

      Rationale:  In other words, the first char after the "F" acts as
      a delimiter for the filename.  This "adaptive quotation"
      mechanism makes it possible to quote every filename without any
      escaping mechanism.
      (OK, _almost_ all filenames: a filename that uses every printable
      character is not quotable.)

   o  The filename after the "F" in the header line is the path to be
      used to save the extracted file.  The format is Unix-like, with
      "/" as separator; it is the duty of the extractor to convert this
      convention to the local one (e.g., "foo/src/bar.c" will be
      transformed to "foo\src\bar.c" for Windows or "[.foo.src]bar.c"
      for VMS).  The last component of the filename is considered the
      basename of the file, while the remaining part is the path, split
      into its components.  This distinction between path and basename,
      although obvious, is worth pointing out because some systems
      (e.g., VMS) use a different syntax for the two components.

   o  The optional group name after the "G" in the header line has no
      particular meaning for the embedding format, but it can help
      users extract only the files of interest.  Suppose, for example,
      that an RFC provides some code in C and some code in Python that
      implement the RFC.  Typically, a user will be interested in only
      one version.  In this case one can collect the C sources under
      the group "C" and the Python sources under the group "python",
      and the user will be able to ask the extractor to extract only
      the files belonging to one group.

   o  Note that in the header line there is no space between the last
      "-" and the following letter (F or G) and that there is _exactly
      one space_ between the end of the group name and the letter F.

   o  The embedded text is between the header and the trailer lines.
      Not every line between the header and the trailer is an embedded
      line, but only the lines that begin with "!," or "!." (after
      discarding any initial spaces, as explained above).  There are
      three types of embedded lines: empty lines, full lines, partial
      lines.
      The corresponding ABNF grammar is as follows

   empty-line     = full-marker
   full-line      = full-marker padding [body]
   partial-line   = partial-marker padding [body]
   full-marker    = "!."
   partial-marker = "!,"
   padding        = SP / NON-ZERO / ALPHA / "|" / "@" / ":"
   body           = *63(%x20-7E) %x21-7E
   ALPHA          = %x41-5A / %x61-7A
   NON-ZERO       = %x31-39

   o  As shown in the grammar, the body of an embedded line can be any
      non-empty sequence of at most 64 printable characters.  Note that
      the body cannot end with a space, since trailing spaces are
      considered non-significant.  The limit of 64 guarantees that the
      total line length (64 characters + 3 spaces for margin + 3
      characters for marker and padding = 70) will not be larger than
      72.

   o  If a line is not empty, the initial marker (full or partial) is
      followed by a character that specifies how many white spaces are
      to be added at the end of the body.  The meaning of padding is
      shown in Table 1.  This is necessary in order to handle lines
      that end with spaces.  As explained in the requirements section,
      this apparently convoluted procedure (trim the spaces at the end
      and put them back later, using a character to store the padding
      length) is necessary because on some occasions ending spaces were
      lost.

   o  If a content line is longer than 64 characters, it is split over
      several embedded lines.  Every embedded line but the last will
      begin with the partial-marker, while the last one will begin with
      the full-marker.  Of course, if the line is no longer than 64
      characters, only the full-line will be generated.

      Rationale:  Markers "!." and "!," were chosen because they do not
      clutter the embedded text too much, keeping it easily readable.
      The choice of using a space rather than a 0 to mark an empty
      padding is due to the same reason.
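      The splitting and padding rules above can be sketched in Ruby as
      follows.  This is only an illustration under stated assumptions,
      not the reference implementation: the names PADDING_CHARS,
      MAX_BODY and embed_line are hypothetical, the input line is
      assumed to be already escaped, and blocks are cut at exactly 64
      characters (an embedder is free to cut earlier).  The padding
      alphabet is the one given in Table 1.

```ruby
# Hypothetical helper names; only the markers, the 64-character limit
# and the padding alphabet come from the format description.

# Padding alphabet of Table 1: the index of a character in this string
# is the number of trailing spaces it stands for (0..64).
PADDING_CHARS = " 123456789" +
                ("A".."Z").to_a.join +
                ("a".."z").to_a.join +
                "@|:"

MAX_BODY = 64

# Turn one already-escaped content line into its embedded lines.
def embed_line(escaped)
  # An empty content line is just the full-marker.
  return ["!."] if escaped.empty?

  # Cut the line into blocks of at most MAX_BODY characters.
  blocks = escaped.chars.each_slice(MAX_BODY).map(&:join)

  blocks.each_with_index.map do |block, i|
    # Only the last block of a content line gets the full-marker.
    marker = (i == blocks.length - 1) ? "!." : "!,"

    # Trailing spaces are removed from the body and counted instead.
    body = block.sub(/ +\z/, "")
    pad  = block.length - body.length

    marker + PADDING_CHARS[pad] + body
  end
end
```

      For instance, embed_line("ends with 3 spaces   ") yields the
      single embedded line "!.3ends with 3 spaces", and a content line
      longer than 64 characters yields one "!," line per leading block
      followed by a final "!." line.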
   o  Since only printable characters can be included in an RFC (and,
      therefore, in the body of an embedded line), an escaping
      mechanism must be provided for the occasional non-printing
      character.  There are three different types of escape sequences,
      described in the following grammar

   normal-escape  = "%" 2HEXDIGIT
   percent-escape = "%%"
   empty-escape   = "%."

      When a string is unescaped the following substitutions are made

      *  In the case of a normal-escape sequence "%xy", the sequence is
         replaced by the octet whose value is equal to the hexadecimal
         number xy (e.g., %09 is replaced by a tab).

         Question: Should we allow this only for non-printable
         characters or also for the printable ones?

      *  In the case of the percent-escape "%%", the sequence is
         replaced by a single %.

      *  In the case of the empty-escape "%.", the sequence is just
         deleted.

         Rationale:  This escape sequence could seem strange.  It has
         been introduced so that it can be used to break sequences of
         characters that could be misinterpreted in some context.  For
         example, if the embedded text is imported in an XML file
         enclosed in a CDATA block and the code to be embedded contains
         the string "]]>" (for example, if the code produces XML
         output), the string "]]>" in the source code can be mistaken
         for the end of the CDATA block.  By adding an empty-escape
         "%." after the "]]" the problem can be avoided.  (In case you
         are wondering: yes, this happened while developing the
         format.)

                        +-----------+--------+
                        | Character | Value  |
                        +-----------+--------+
                        | space     | 0      |
                        | 1..9      | 1..9   |
                        | A..Z      | 10..35 |
                        | a..z      | 36..61 |
                        | @         | 62     |
                        | |         | 63     |
                        | :         | 64     |
                        +-----------+--------+

                Table 1: Meaning of the padding character

5.  Extracting and embedding

   In this section we describe the expected behaviour of an
   extractor/embedder by describing a possible way to extract/embed
   text.
   We want to emphasize here that we do not want to constrain the
   algorithm used in a specific extractor/embedder: as long as the
   external behaviour is equal to the behaviour of the algorithms
   described here, the extractor/embedder is acceptable.  By "external
   behaviour is equal" we mean that, starting from the same correctly
   formatted input, the two algorithms produce exactly the same output.

   If the input is not correctly formatted, we have an error condition.
   In the presence of an error condition the extractor can behave in
   any way: raise a warning, abort the execution or try to correct the
   error.

5.1.  Extractor behaviour

   The behaviour of an extractor must be equivalent to the following

   1.  The extractor keeps an internal flag In_Embedding that is true
       when the extractor is processing an embedding section.  The
       extractor also keeps a string Buffer used to accumulate partial
       embedded lines.

   2.  At the beginning the flag In_Embedding is set to false and the
       Buffer is empty.

   3.  For every line of the input

       1.  Trim the spaces at the beginning and at the end of the line.

       2.  If the line is a header-line and In_Embedding is false, set
           In_Embedding to true and start the next iteration of the
           loop.  It is an error if the line is a header-line and
           In_Embedding is true.

       3.  If we are here, the line is not a header-line.  If
           In_Embedding is false, start a new iteration; otherwise
           check the line type

           1.  If the line is a full-line (including an empty-line),

               1.  Append the body of the line, suitably padded with
                   the required number of spaces, to Buffer

               2.  Unescape the content of Buffer

               3.  Output the result of the unescape procedure

               4.  Empty the buffer and start a new iteration

           2.  If the line is a partial-line, append its content,
               suitably padded with the required number of spaces, to
               Buffer and start a new iteration.

           3.  If the line is a trailer-line, set In_Embedding to false
               and start a new iteration.  It is an error if Buffer is
               not empty at this point, since it should not be possible
               for an embedded section to terminate with a partial
               line.

           4.  Ignore any other (non-embedded) line.

   4.  It is an error if at the end of the file In_Embedding is true.

   Note that the unescaping step is done after joining all the lines in
   a block and not line-by-line.  This is because a partial line could
   end with a partial escape sequence, e.g., "%0".  Although the
   embedder could be careful and not split a line in the middle of an
   escape sequence, it is simpler to do the unescaping after the
   joining.

5.2.  Embedder behaviour

   The behaviour required of the embedder is simpler.  The embedder,
   for every line of the content

   1.  Applies the escape procedure to the content line, replacing
       every non-printable character and every % with the corresponding
       escape sequence.  Optionally, it can insert the empty escape
       sequence "%." where necessary.

   2.  Breaks the result of the escaping procedure into blocks of at
       most 64 characters.  (Of course, if the line is no longer than
       64 characters, a single block suffices.)  Blocks can be shorter
       than 64 characters if the embedder decides to break the line in
       some special places like, for example, at whitespace.  (Breaking
       lines at whitespace could make it easier to extract the content
       by hand.)

   3.  For every block obtained at the previous step

       1.  Removes the trailing spaces, keeping track of their number

       2.  Prepends to the block the correct marker (full-marker if the
           block is the last one, partial-marker for the others)
           followed by the character specifying the number of removed
           trailing spaces

       3.  Outputs the result of the previous step

   Of course, the sequence of content lines will be preceded by the
   header line and followed by the trailer line.

6.  Acknowledgements

7.  IANA Considerations

   This memo includes no request to IANA.

8.  Security Considerations

   All drafts are required to have a security considerations section.
   See RFC 3552 [RFC3552] for a guide.

   TO BE WRITTEN

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [min_ref]  authSurName, authInitials., "Minimal Reference", 2006.

9.2.  Informative References

   [I-D.narten-iana-considerations-rfc2434bis]
              Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs",
              draft-narten-iana-considerations-rfc2434bis-09 (work in
              progress), March 2008.

   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
              June 1999.

   [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
              Text on Security Considerations", BCP 72, RFC 3552,
              July 2003.

Appendix A.  Software

   In this appendix we include some software to extract/embed code
   from/into RFCs.  More precisely, we provide

   o  A bare-bones, not fully conformant, Ruby script for extraction.
      The script is not fully conformant because

      1.  It does not handle some corner cases such as padding lengths
          larger than 9, lines ending with a '%', and so on...

      2.  It does not save the extracted files according to the path
          specified with F, but it acts as a filter, writing to
          standard output (i) the header and trailer lines "as they
          are" and (ii) the content lines unescaped and joined
          together.

      For example, if one runs the Ruby script on the example shown in
      Section 2, one obtains on the standard output the following
      result

      !--((BGN))--Ggroup-name F"example.txt"
      This is an example of embedded file
      This is a very, very, very, very, very, very, very, very, very,
      very, very, very long line

      The line above is empty
      This line ends with 3 spaces
      ^IThis line begins with a tab and contains ^E^G non-printable
      characters
      !--((END))--

      where we used notation like ^I to make the non-printable
      characters "visible" (in the actual output the two long content
      lines are not broken).  Note that from this output it is quite
      easy to extract and save the real content with a simple editor.

      The reason for providing this bare-bone script is that, being
      very short, it is quite easy to extract by hand, so that it can
      be used as a help in extracting the sources of the full
      implementation.  In a sense, the Ruby script acts as a
      "bootstrap" extractor.

   o  A fully conformant implementation of an embedder and an
      extractor.  The source set contains 15 source files, plus some
      auxiliary files (README, project files, ...).  They can be easily
      extracted with the bare-bone script.

A.1.  Bare-bone extractor

   The following is the basic extractor in Ruby.  Note that it does not
   use the embedding format described here since it is expected to be
   extracted manually.

   #!/usr/bin/env ruby

   # Append to body the trailing spaces announced by the padding
   # character (only ' ' and 1..9 are handled).
   def pad_body(padding, body)
     if padding == ' '
       return body
     else
       return body + (" " * padding.to_i)
     end
   end

   # Replace "%xy", "%%" and "%." escape sequences.
   def unescape(s)
     result = ""

     while pos = s.index('%')
       result += s[0...pos]

       raise "% Alone" if pos == s.length-1

       if s[pos+1] == '.'
         s = s[pos+2..-1]
       elsif s[pos+1] == '%'
         result += '%'
         s = s[pos+2..-1]
       else
         result += s[pos+1..pos+2].to_i(16).chr
         s = s[pos+3..-1]
       end
     end

     return result + s
   end

   buffer = ""

   $stdin.each_line do |line|
     case line.strip
     when /^!\,([ 1-9])(.*)$/      # partial line: accumulate
       buffer += pad_body($1, $2)

     when /^!\.([ 1-9])(.*)$/      # full line: flush the block
       buffer += pad_body($1, $2)
       puts unescape(buffer)
       buffer = ""

     when /^!\.$/                  # empty line
       puts ""

     when /^!--/                   # header or trailer: copy as is
       puts line
       buffer = ""
     end
   end

   ## END OF CODE ##

   The use of the script above is very simple.
   Suppose you called it "basic_extractor.rb" and that you want to
   extract the code from "cool-rfc.txt" to "cool-sources.txt"; you just
   run

      ruby basic_extractor.rb < cool-rfc.txt > cool-sources.txt

   Now you can extract the sources from cool-sources.txt with an editor
   and some patience...

A.2.  Full implementation

A.2.1.  Usage

A.2.1.1.  Embedder

   The command line syntax for the embedder is as follows

      rfc_embedding-embedder [-C] [-g] filename ...

   The command prints to the standard output the files listed on the
   command line, embedded as described in this document.  If option -C
   is present, every file is embedded in a CDATA block.  If option -g
   is present, it specifies the group name to be used.

A.2.1.2.  Extractor

   The command line syntax for the extractor is as follows

      rfc_embedding-extractor [-g] [filename]

   The command reads the specified file (or the standard input, if
   filename is missing) and saves the embedded files, as specified by
   their paths.  If option -g is present, only the files of the
   specified group are extracted.

A.2.2.  Sources

   !--((BGN))--Gada F"rfc_embedding.ads"
   !.
   !. package RFC_Embedding is
   !. -- Root package
   !. end RFC_Embedding;
   !--((END))--
   !--((BGN))--Gada F"rfc_embedding-command_parsing.adb"
   !. with Ada.Strings.Fixed;
   !.
   !. with Ada.Command_Line;
   !. use Ada;
   !. with Ada.Text_IO; use Ada.Text_IO;
   !.
   !. package body RFC_Embedding.Command_Parsing is
   !. Option_Char : constant Character := '-';
   !.
   !. ------------------------
   !. -- Parse_Command_Line --
   !. ------------------------
   !.
   !, function Parse_Command_Line (Spec : String) return Parsing_Re
   !. sult is
   !. use Ada.Strings.Fixed;
   !.
   !. type Spec_Entry is
   !. record
   !. Void : Boolean;
   !. Has_Arg : Boolean;
   !. end record;
   !.
   !, Void_Spec : constant Spec_Entry := (Void => True, others =
   !. > <>);
   !.
   !.
type Parsed_Spec is array (Character) of Spec_Entry; !. !. function Parse_Spec (Spec : String) return Parsed_Spec is !. Result : Parsed_Spec := (others => Void_Spec); !. Cursor : Positive; !. Option : Character; !. begin !. Cursor := Spec'First; !. !. while Cursor <= Spec'Last loop !. Option := Spec (Cursor); !. Cursor := Cursor + 1; !. !. if not Result (Option).Void then !. raise Program_Error; !. end if; !. !,1 if Cursor <= Spec'Last and then Spec (Cursor) = ':' !. then !. Cursor := Cursor + 1; !, Result (Option) := (Void => False, Has_Arg => Tru !. e); !. else !, Result (Option) := (Void => False, Has_Arg => Fal !. se); !. end if; !. end loop; !. !. return Result; !. end Parse_Spec; !. Riccardo Expires January 13, 2014 [Page 17] Internet-Draft Embedding Code July 2013 !. Specs : Parsed_Spec := Parse_Spec (Spec); !. Result : Parsing_Result := !. (Table => (others => Void_Entry), !, Non_Option => Command_Line.Argument_Count + 1) !. ; !. Cursor : Positive; !. begin !. Cursor := 1; !. !. Option_Loop : !. while Cursor <= Command_Line.Argument_Count loop !. declare !. Option : String := Command_Line.Argument (Cursor); !. Name : Character; !. begin !. -- Put_Line ("Parsing '" & Option & "'"); !. !, if Option (Option'First) /= Option_Char or Option'Le !. ngth = 1 then !, -- Put_Line ("Exiting... con cursor = " & Cursor !. 'Img); !. Result.Non_Option := Cursor; !. exit Option_Loop; !. end if; !. !. Name := Option (Option'First + 1); !. !. if Name = Option_Char then !. -- Option '--' mark the end of the options !. !. Result.Non_Option := Cursor + 1; !. exit Option_Loop; !. end if; !. !. !. !. !. if Specs (Name).Void then !. raise Program_Error; !. end if; !. !. if Result.Table (Name).Given then !, raise Program_Error with "Option '" & Name & "' g !. iven twice"; !. end if; !. !. Result.Table (Name) := (Given => True, !, Value => Null_Unbounded_Stri Riccardo Expires January 13, 2014 [Page 18] Internet-Draft Embedding Code July 2013 !. ng); !. !. !. Cursor := Cursor + 1; !. 
!. if Specs (Name).Has_Arg then !. if Option'Length > 2 then !. Result.Table (Name).Value := !, To_Unbounded_String (Tail (Option, Option'Le !. ngth - 2)); !. else !. if Cursor > Command_Line.Argument_Count then !. raise Program_Error; !. else !. Result.Table (Name).Value := !, To_Unbounded_String (Command_Line.Argumen !. t (Cursor)); !. !. Cursor := Cursor + 1; !. end if; !. end if; !. else !. if Option'Length > 2 then !. raise Program_Error; !. end if; !. end if; !. end; !. end loop Option_Loop; !. !, -- Put_Line ("Primo utile : " & Integer'Image (Result.Non !. _Option)); !. return Result; !. end Parse_Command_Line; !. !. ---------------- !. -- Get_Option -- !. ---------------- !. !. function Get_Option !. (From : Parsing_Result; !. Name : Character; !. Default : String := "") !. return String !. is !. begin !. if From.Table (Name).Given then !. return To_String (From.Table (Name).Value); !. else Riccardo Expires January 13, 2014 [Page 19] Internet-Draft Embedding Code July 2013 !. return Default; !. end if; !. end Get_Option; !. !. !. -------------- !. -- Is_Given -- !. -------------- !. !. function Is_Given (Item : Parsing_Result; !. Name : Character) return Boolean is !. begin !. return Item.Table (Name).Given; !. end Is_Given; !. !. ---------------------- !. -- First_Non_Option -- !. ---------------------- !. !, function First_Non_Option (Item : Parsing_Result) return Nat !. ural is !. begin !. return Item.Non_Option; !. end First_Non_Option; !. !. end RFC_Embedding.Command_Parsing; !--((END))-- !--((BGN))--Gada F"rfc_embedding-command_parsing.ads" !. with Ada.Strings.Unbounded; use Ada.Strings.Unbounded; !. !. -- !. -- Yes, I confess, I am guilty of writing the Aleph_1-th !. -- package for parsing command lines. It must be said that !, -- the usage of this package is more convinient (in my opinion) !. !. -- than the usual getopt-like approach. !. -- !. -- The basic usage is as follows: !. -- !, -- * At the beginning of the program, the function Parse_Com !. 
mand_Line !, -- is called and it returns an object of type Parsing_Resu !. lt. Since !, -- Parsing_Result is declared to have unknown discriminant !. s (<>), a variable !, -- of this type must be initialized at declaration time, s !. o the !. -- usual calling sequence is !. -- !, -- CL_Options : Parsing_Result := Parse_Command_Line("g !. :xh"); !. -- !, -- * The Parsing_Result object is queried to check if some o !. ptions were !. -- specified and to get the value given to the option. !. -- !, -- Options begin with '-' and their name is just a single chara !. cter !, -- (multi-char names are not implemented), the option order doe !. s not matter, !. -- option values can be given both as in "-f115" or "-f 115", !, -- an option cannot be specified more than once, special option !. '--' !. -- can be used to mark the end of options. !. -- !. !. package RFC_Embedding.Command_Parsing is !. type Parsing_Result (<>) is tagged private; !. !, function Parse_Command_Line (Spec : String) return Parsing_Re !. sult; !, -- Parse the given command line and return the result in a v !. alue !, -- of type Parsing_Result. Spec specifies the recognized op !. tions. !, -- As with getopt, Spec is a sequence of options where ':' f !. ollows !. -- an option name if the option expects a parameter !. !. function Get_Option (From : Parsing_Result; !. Name : Character; !. Default : String := "") !. return String; !, -- Return the value associated to the specified option Name. !. If the !, -- option was not given on the command line, return the valu !. e !. -- of Default. !. !. !. !. function Is_Given (Item : Parsing_Result; !. Name : Character) return Boolean; !, -- Return true if the specified option was given on the comm !. and line !. !, function First_Non_Option (Item : Parsing_Result) return Nat !. ural; !.
-- Return the index of the first non-option value !. !. private !. type Option_Entry is !. record !. Given : Boolean; !. Value : Unbounded_String; !. end record; !. !, Void_Entry : constant Option_Entry := (False, Null_Unbounded_ !. String); !. !. type Option_Array is array (Character) of Option_Entry; !. !. type Parsing_Result is tagged !. record !. Table : Option_Array; !. Non_Option : Positive; !. end record; !. end RFC_Embedding.Command_Parsing; !--((END))-- !--((BGN))--Gada F"rfc_embedding-embedder.adb" !. with RFC_Embedding.Syntax; !. with RFC_Embedding.Syntax.Escaping; !. !. with RFC_Embedding.Command_Parsing; !. !. with Ada.Directories; !. with Ada.Strings.Unbounded; !. with Ada.Strings.Fixed; !. with Ada.Text_IO.Unbounded_IO; !. with Ada.Command_Line; !. use Ada; !. use Ada.Strings.Unbounded; !. use Ada.Strings.Fixed; !. use Ada.Text_IO; !. !. procedure RFC_Embedding.Embedder is !. User_Error : exception; !. !. procedure Print_Embedded (Filename : String; !. Group : String; Riccardo Expires January 13, 2014 [Page 22] Internet-Draft Embedding Code July 2013 !. Use_Cdata : Boolean) is !. use Syntax, Syntax.Escaping; !. use Text_IO; !. use Text_IO.Unbounded_IO; !. use type Directories.File_Kind; !. !. Input : Text_IO.File_Type; !. begin !, if Directories.Kind (Filename) /= Directories.Ordinary_Fil !. e then !. return; !. end if; !. !. Text_IO.Open (File => Input, !. Mode => Text_IO.In_File, !. Name => Filename); !. !. if Use_CData then !. Put_Line (" Filename, !. Group => Group)); !. !. loop !. declare !. Lines : constant Line_Array := !, Make_Embedded_Lines (Escape (Get_Line (In !. put), "]]%.")); !. begin !. for I in Lines'Range loop !. Put_Line (Lines (I)); !. end loop; !. end; !. !. exit when End_Of_File (Input); !. end loop; !. !. Put_Line (Syntax.End_Line); !. !. if Use_CData then !. Put_Line ("]]%.>"); !. end if; !. !. end Print_Embedded; !. !. procedure Die_With_Help is !. 
begin Riccardo Expires January 13, 2014 [Page 23] Internet-Draft Embedding Code July 2013 !. New_Line (Standard_Error); !, Put_Line (Standard_Error, "Usage: embedder [options...] [f !. ilename...]"); !. New_Line (Standard_Error); !, Put_Line (Standard_Error, " If no filename is given on th !. e command"); !, Put_Line (Standard_Error, " line, the names are read from !. the "); !. Put_Line (Standard_Error, " standard input."); !. New_Line (Standard_Error); !. Put_Line (Standard_Error, " Options:"); !, Put_Line (Standard_Error, " -g Use group name" !. ); !, Put_Line (Standard_Error, " -C Put a CData she !. ll around the text"); !. New_Line (Standard_Error); !. !. raise User_Error; !. end Die_With_Help; !. !, procedure Get_Group_Name (Group_Name : out Unbounded_String !. ; !. First_Unused : out Positive) is !. begin !. if Command_Line.Argument_Count = 0 then !. Group_Name := Null_Unbounded_String; !. First_Unused := 1; !. return; !. end if; !. !. declare !. First_Arg : String := Command_Line.Argument (1); !. begin !. First_Unused := 1; !. !, if First_Arg'Length >= 2 and then Head (First_Arg, 2) = !. "-g" then !. if First_Arg'Length > 2 then !. Group_Name := !, To_Unbounded_String (Tail (First_Arg, First_Ar !. g'Length - 2)); !. !. First_Unused := 2; !. else !. if Command_Line.Argument_Count = 1 then !. Die_With_Help; !. end if; !. Riccardo Expires January 13, 2014 [Page 24] Internet-Draft Embedding Code July 2013 !, Group_Name := To_Unbounded_String (Command_Line.A !. rgument (2)); !. First_Unused := 3; !. end if; !. end if; !. end; !. end Get_Group_Name; !. !. First_Unused : Positive; !. Group_Name : Unbounded_String; !. Use_CData_Shell : Boolean; !. Options : Command_Parsing.Parsing_Result := !, Command_Parsing.Parse_Command_Line ("g:C" !. ); !. begin !, Group_Name := To_Unbounded_String (Options.Get_Option ('g', " !. ")); !. Use_CData_Shell := Options.Is_Given ('C'); !. !. First_Unused := Options.First_Non_Option; !. !. 
if Command_Line.Argument_Count >= First_Unused then !. for I in First_Unused .. Command_Line.Argument_Count loop !. Print_Embedded (Filename => Command_Line.Argument (I), !. Group => To_String (Group_Name), !. Use_CData => Use_CData_Shell); !. end loop; !. else !. loop !. declare !. Filename : String := Get_Line; !. begin !. Print_Embedded (Filename => Filename, !. Group => To_String (Group_Name), !. Use_CData => Use_CData_Shell); !. end; !. !. exit when End_Of_File; !. end loop; !. end if; !. !. exception !. when User_Error => !. Command_Line.Set_Exit_Status (Command_Line.Failure); !. end RFC_Embedding.Embedder; !--((END))-- Riccardo Expires January 13, 2014 [Page 25] Internet-Draft Embedding Code July 2013 !--((BGN))--Gada F"rfc_embedding-errors.adb" !. with Ada.Text_IO; use Ada.Text_IO; !. !. package body RFC_Embedding.Errors is !. Warnings_Are_Fatal : Boolean := False; !. !. procedure Raise_Error (Source : Error_Source) !. is !. begin !. case Source is !. when User => !. raise User_Error; !. when Format => !. raise Format_Error; !. end case; !. end Raise_Error; !. !. procedure Warning (Msg : String; !. Source : Error_Source := User) is !. begin !. Put_Line (Standard_Error, "Warning : " & Msg); !. !. if Warnings_Are_Fatal then !. Raise_Error (Source); !. end if; !. end Warning; !. !. procedure Die (Msg : String; !. Source : Error_Source := User) is !. begin !. Put_Line (Standard_Error, "Error : " & Msg); !. !. Raise_Error (Source); !. end Die; !. !. ------------------------- !. -- Make_Warnings_Fatal -- !. ------------------------- !. !. procedure Make_Warnings_Fatal (Status : Boolean) is !. begin !. Warnings_Are_Fatal := Status; !. end Make_Warnings_Fatal; !. !. end RFC_Embedding.Errors; !--((END))-- Riccardo Expires January 13, 2014 [Page 26] Internet-Draft Embedding Code July 2013 !--((BGN))--Gada F"rfc_embedding-errors.ads" !. !. package RFC_Embedding.Errors is !. type Error_Source is (User, Format); !. !. procedure Die (Msg : String; !. 
Source : Error_Source := User); !. !. procedure Warning (Msg : String; !. Source : Error_Source := User); !. !. procedure Make_Warnings_Fatal (Status : Boolean); !. !. User_Error : exception; !. Format_Error : exception; !. end RFC_Embedding.Errors; !--((END))-- !--((BGN))--Gada F"rfc_embedding-extractor.adb" !. with Ada.Strings.Fixed; !. with Ada.Strings.Maps.Constants; !. with Ada.Strings.Unbounded; !. use Ada.Strings.Unbounded; !. use Ada.Strings; !. use Ada; !. !. with Ada.Directories; !. !. with Ada.Characters.Handling; !. with Ada.Characters.Latin_1; !. use Ada.Characters.Latin_1; !. !. with Ada.Command_Line; !. use Ada.Command_Line; !. !. with Ada.Text_IO.Unbounded_IO; !. use Ada.Text_IO; !. !. with RFC_Embedding.Syntax.Escaping; !. with RFC_Embedding.Errors; !. !. use RFC_Embedding.Syntax.Escaping; !. use RFC_Embedding.Errors; !. !. with RFC_Embedding.Command_Parsing; !. use RFC_Embedding.Command_Parsing; !. !. procedure RFC_Embedding.Extractor is !. type Group_Pattern is new Unbounded_String; !. !. Any_Group : constant Group_Pattern := !. Group_Pattern (Null_Unbounded_String); !. !. type Status_Type is (In_Body, In_Content); !. !. type Content_Action is (Extract, Skip); !. !,1 function Read_Line (Input : Text_Io.File_Type) return String !. is !. Line : String := Text_Io.Get_Line (Input); !. From : Natural := Line'First; !. To : Natural := Line'Last; !. begin !. -- Ada is smart enough to handle the local end-of-line !, -- convention. Although according to RFC ????, the RFC t !. ext !, -- should use the local convention for end-of-line, it co !. uld happen !. -- that a line still carries a stray CR or LF at either end. !. !. -- An empty line has no first or last character to test. !. if Line'Length = 0 then !. return Line; !. end if; !. !. if Line (Line'First) = CR or Line (Line'First) = LF then !. From := From + 1; !. end if; !. !. if Line (Line'Last) = CR or Line (Line'Last) = LF then !. To := To - 1; !. end if; !. !. return Line (From .. To); !. end Read_Line; !. !. procedure Empty_Buffer (Output : Text_Io.File_Type; !.
Buffer : in out Unbounded_String) !. is !. begin !. Put_Line (File => Output, !. Item => Unescape (To_String (Buffer))); !. !. Buffer := Null_Unbounded_String; !. end Empty_Buffer; !. !. function Is_Desired (Desired : Group_Pattern; !. Group : Syntax.Group_Name) !. return Boolean !. is !. use Syntax; !. begin !. Put_Line ("Desired : '" & To_String (Desired) & "'" !. & " Got : '" & To_String (Group) & "'"); !. return !. Desired = Any_Group or !. Unbounded_String (Desired) = Unbounded_String (Group); !. end Is_Desired; !. !. !. Output : Text_Io.File_Type; !. Input : Text_Io.File_Type; !. Buffer : Unbounded_String := Null_Unbounded_String; !. Status : Status_Type := In_Body; !. Action : Content_Action; !. Class : Syntax.Line_Class; !. Content : Unbounded_String; !. Desired : Group_Pattern := Any_Group; !. Options : Parsing_Result := Parse_Command_Line ("g:"); !. begin !. if Options.Is_Given('g') then !, Desired := To_Unbounded_String (Options.Get_Option ('g')); !. !. end if; !. !, case Command_Line.Argument_Count - Options.First_Non_Option i !. s !. when -1 => !. null; !. !. when 0 => !. Text_Io.Open (File => Input, !. Mode => Text_Io.In_File, !, Name => Command_Line.Argument (Options.Fi !. rst_Non_Option)); !. !. Text_Io.Set_Input (Input); !. !. when others => !. raise Program_Error; !. end case; !. !. loop !. declare !. Line : String := Text_Io.Get_Line; !. begin !. case Status is !. when In_Body => !. if Syntax.Is_Begin_Line (Line) then !. Status := In_Content; !. !. declare !. use Ada.Directories; !. !. Group : Syntax.Group_Name; !. Path : Unbounded_String; !. Filename : Unbounded_String; !. begin !. Syntax.Parse_Begin_Line (Item => Line, !, Group => Group, !. !. Dir => Path, !, Filename => Filena !. me); !. !. if Path /= Null_Unbounded_String then !. Create_Path (To_String (Path)); !. end if; !.
!. if Is_Desired (Desired, Group) then !. Action := Extract; !. !. Text_Io.Create !. (File => Output, !. Mode => Text_Io.Out_File, !. Name => Compose !,1 (Containing_Directory => To_String !. (Path), !,1 Name => To_String !. (Filename))); !. else !. Action := Skip; !. end if; !. end; !. end if; !. !. when In_Content => !. Syntax.Parse_Content_Line (Item => Line, !. Class => Class, !. Content => Content); !. !. case Class is !. when Syntax.Full_Content => !. if Action = Extract then !. Append (Buffer, Content); !. !. Empty_Buffer (Output, Buffer); Riccardo Expires January 13, 2014 [Page 30] Internet-Draft Embedding Code July 2013 !. end if; !. !. !. when Syntax.Partial_Content => !. if Action = Extract then !. Append (Buffer, Content); !. end if; !. !. when Syntax.End_Content => !. if Buffer /= Null_Unbounded_String then !, Warning ("End line after partial content !. line"); !. !. if Action = Extract then !. Empty_Buffer (Output, Buffer); !. end if; !. end if; !. !. if Action = Extract then !. Text_Io.Close (Output); !. end if; !. !. Status := In_Body; !. !. when Syntax.No_Content => !. null; !. end case; !. end case; !. end; !. !. if Text_Io.End_Of_File then !. if Status = In_Content then !. Warning ("End of file while in content"); !. !. if Action = Extract then !. if Buffer /= Null_Unbounded_String then !. Empty_Buffer (Output, Buffer); !. end if; !. !. Text_Io.Close (Output); !. end if; !. end if; !. !. exit; !. end if; !. end loop; !. !. exception Riccardo Expires January 13, 2014 [Page 31] Internet-Draft Embedding Code July 2013 !. when User_Error => !. Set_Exit_Status (Failure); !. end RFC_Embedding.Extractor; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax.adb" !, with Ada.Strings.Fixed; use Ada, Ada.Strings; !. !. !. with RFC_Embedding.Syntax.Path_Syntax; !. with RFC_Embedding.Syntax.Escaping; !. !. !. package body RFC_Embedding.Syntax is !. Max_Block_Size : constant Integer := 64; !. Begin_Mark : constant String := "!--((BGN))--"; !. 
End_Mark : constant String := "!--((END))--"; !. Full_Line_Head : constant String := "!."; !. Partial_Line_Head : constant String := "!,"; !. !. Group_Marker : constant Character := 'G'; !. Filename_Marker : constant Character := 'F'; !. !, subtype Extended_Space_Length is Integer range -1 .. Max_Bloc !. k_Size; !, subtype Space_Length is Extended_Space_Length range 0 .. Max_ !. Block_Size; !. !. Space_Length_Encoding : array (Space_Length) of Character; !, Space_Length_Decoding : array (Character) of Extended_Space_L !. ength := !. (others => -1); !. !, pragma Assert (Full_Line_Head'Length = Partial_Line_Head'Leng !. th); !, pragma Assert (Full_Line_Head'Length + Max_Block_Size + 3 < 7 !. 2); !. !. ------------------- !. -- Is_Begin_Line -- !. ------------------- !. !. function Is_Begin_Line (Line : String) return Boolean is !. L : String := Fixed.Trim(Line, Strings.Left); !. begin !. return L'Length > Begin_Mark'Length and then !, L (L'First .. L'First + Begin_Mark'Length - 1) = Begin_ Riccardo Expires January 13, 2014 [Page 32] Internet-Draft Embedding Code July 2013 !. Mark; !. end Is_Begin_Line; !. !. ---------------------- !. -- Parse_Begin_Line -- !. ---------------------- !. !. procedure Parse_Begin_Line !. (Item : String; !. Group : out Group_Name; !. Dir : out Unbounded_String; !. Filename : out Unbounded_String) !. is !. use Ada.Strings.Fixed; !. !. procedure Extract_Group_Name (Line : String; !. Cursor : in out Natural; !. Group : out Group_Name) !. is !. Idx : Natural; !. begin !. Idx := Index (Source => Line, !. Pattern => " ", !. From => Cursor); !. !. if Idx = 0 or Idx = Cursor or Idx = Line'Last then !. raise Program_Error; !. end if; !. !, Group := To_Unbounded_String (Line (Cursor + 1 .. Idx - !. 1)); !. !. Cursor := Index_Non_Blank (Source => Line, !. From => Idx + 1); !. !. if Cursor = 0 then !. raise Program_Error; !. end if; !. end Extract_Group_Name; !. !. Line : String := Fixed.Trim (Item, Strings.Left); !. Cursor : Natural; !. begin !. 
if not Is_Begin_Line (Line) then !. raise Program_Error; !. end if; !. !. Cursor := Index_Non_Blank (Source => Line, Riccardo Expires January 13, 2014 [Page 33] Internet-Draft Embedding Code July 2013 !, From => Line'First + Begin_Ma !. rk'Length); !. !. if Cursor = 0 then !. raise Program_Error; !. end if; !. !. if Line (Cursor) = Group_Marker then !. Extract_Group_Name (Line, Cursor, Group); !. else !. Group := Group_Name (Null_Unbounded_String); !. end if; !. !, if Cursor + 2 > Line'Last or else Line (Cursor) /= Filenam !. e_Marker then !. raise Program_Error; !. end if; !. !. Cursor := Cursor + 1; !. !. declare !. use Path_Syntax; !. !. End_Char : constant String := Line (Cursor .. Cursor); !. Last : constant Natural := Index (Source => Line, !, Pattern => End_Ch !. ar, !, From => Cursor !. + 1); !. begin !. if Last = 0 or Last = Cursor + 1 then !. raise Program_Error; !. end if; !. !, To_Local_Syntax (Source => Line (Cursor + 1 .. Last - !. 1), !. Path => Dir, !. Filename => Filename); !. end; !. !. end Parse_Begin_Line; !. !. --------------------- !. -- Make_Begin_Line -- !. --------------------- !. !. function Make_Begin_Line !. (Full_Filename : String; Riccardo Expires January 13, 2014 [Page 34] Internet-Draft Embedding Code July 2013 !. Group : String := "") !. return String !. is !. use Ada.Strings.Fixed; !. use Path_Syntax; !. !. function Find_Unused (X : String) return Character is !. Candidates : constant String := """'%%$|,;:*^#"; !. begin !. for I in Candidates'Range loop !,1 if Index (Source => X, Pattern => Candidates (I .. !. I)) = 0 then !. return Candidates (I); !. end if; !. end loop; !. !. raise Program_Error; !. end Find_Unused; !. !, Result : Unbounded_String := To_Unbounded_String (Begin_Ma !. rk); !, Embedding_Name : String := To_Embedded_Syntax (Full_Filena !. me); !. Delim : Character := Find_Unused (Embedding_Name); !. begin !. if Group /= "" then !. if Index (Source => Group, Pattern => " ") /= 0 then !. raise Program_Error; !. 
end if; !. !. Result := Result & Group_Marker & Group & " "; !. end if; !. !. Result := Result !. & Filename_Marker !. & Delim !. & Embedding_Name !. & Delim; !. !. return To_String (Result); !. end Make_Begin_Line; !. !. -------------- !. -- End_Line -- !. -------------- !. !. function End_Line return String is !. begin Riccardo Expires January 13, 2014 [Page 35] Internet-Draft Embedding Code July 2013 !. return End_Mark; !. end End_Line; !. !. ------------------------ !. -- Parse_Content_Line -- !. ------------------------ !. !. procedure Parse_Content_Line !. (Item : String; !. Class : out Line_Class; !. Content : out Unbounded_String) !. is !. use Escaping; !. !. function Get_Line_Type (Line : String) return Line_Class !. is !, function Head_Is (Line : String; Other : String) return !. Boolean !. is !. begin !. if Line'Length < Other'Length then !. return False; !. else !, return Line (Line'First .. Line'First + Other'Len !. gth - 1) = Other; !. end if; !. end Head_Is; !. begin !. if Head_Is (Line, End_Mark) then !. return End_Content; !. !. elsif Head_Is (Line, Full_Line_Head) then !. return Full_Content; !. !. elsif Head_Is (Line, Partial_Line_Head) then !. return Partial_Content; !. !. else !. return No_Content; !. end if; !. end Get_Line_Type; !. !. Line : String := Fixed.Trim (Item, Strings.Left); !. Padding : Integer; !. begin !. Class := Get_Line_Type (Line); !. !. if Class = Partial_Content or Class = Full_Content then Riccardo Expires January 13, 2014 [Page 36] Internet-Draft Embedding Code July 2013 !. if Line'Length = Full_Line_Head'Length then !. if Class = Full_Content then !. Content := Null_Unbounded_String; !. return; !. else !. raise Program_Error; !. end if; !. end if; !. !, Padding := Space_Length_Decoding (Line (Line'First + Fu !. ll_Line_Head'Length)); !. !. if not (Padding in Space_Length) then !. raise Program_Error; !. end if; !. !. Content := To_Unbounded_String !, (Line (Line'First + Full_Line_Head'Length + 1 .. Line !. 'Last)); !. !. 
Content := Content & (Padding * " "); !. end if; !. end Parse_Content_Line; !. !. ------------------------- !. -- Make_Embedded_Lines -- !. ------------------------- !. !, function Make_Embedded_Lines (Line : String) return Line_Arra !. y is !. use Escaping; !. !, N_Blocks : constant Integer := 1 + Line'Length / Max_Blo !. ck_Size; !. !, function Make_Block (Item : String) return Unbounded_Strin !. g is !, Trimmed : constant String := Fixed.Trim (Item, Strings. !. Right); !, Padding : constant Space_Length := Item'Length - Trimme !. d'Length; !. begin !. if Item = "" then !. return Null_Unbounded_String; !. else !. return To_Unbounded_String !. (Space_Length_Encoding (Padding) & Trimmed); !. end if; Riccardo Expires January 13, 2014 [Page 37] Internet-Draft Embedding Code July 2013 !. end Make_Block; !. !. Result : Line_Array (1 .. N_Blocks); !. begin !. for Block in 1 .. N_Blocks - 1 loop !. Result (Block) := Partial_Line_Head & !, Make_Block (Line ((Block - 1) * Max_Block_Size + 1 .. B !. lock * Max_Block_Size)); !. end loop; !. !. Result (N_Blocks) := Full_Line_Head & !, Make_Block (Line ((N_Blocks - 1) * Max_Block_Size + 1 .. L !. ine'Last)); !. !. return Result; !. end Make_Embedded_Lines; !. begin !. Space_Length_Encoding (0) := ' '; !. !. for I in 1 .. 9 loop !,1 Space_Length_Encoding (I) := Character'Val (Character'Pos !. ('0')+I); !. end loop; !. !. for I in 0 .. 25 loop !, Space_Length_Encoding (I + 10) := Character'Val (Character !. 'Pos ('A')+I); !, Space_Length_Encoding (I + 36) := Character'Val (Character !. 'Pos ('a')+I); !. end loop; !. !. Space_Length_Encoding (62) := '@'; !. Space_Length_Encoding (63) := '|'; !. Space_Length_Encoding (64) := ':'; !. !. for I in Space_Length_Encoding'Range loop !. Space_Length_Decoding (Space_Length_Encoding (I)) := I; !. end loop; !. end RFC_Embedding.Syntax; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax.ads" !. with Ada.Strings.Unbounded; use Ada.Strings.Unbounded; !. !. -- !. 
-- This package takes care of most of the syntactical aspects !. -- of the embedded format: embedded line headers, begin/end !. -- markers, groups and filenames and so on !. -- !. !. package RFC_Embedding.Syntax is !. type Group_Name is new Unbounded_String; !. !. type Line_Class is !. (Full_Content, Partial_Content, No_Content, End_Content); !. !. function Is_Begin_Line (Line : String) return Boolean; !. -- Return true if the line begins an embedded content !. !. !. procedure Parse_Begin_Line !. (Item : String; !. Group : out Group_Name; !. Dir : out Unbounded_String; !. Filename : out Unbounded_String); !. !. pragma Precondition (Is_Begin_Line (Item)); !, -- Parse a "begin line" and return the filename of the conte !. nt, an !,1 -- optional path and the optional group the content belongs !. to. The path !, -- and the filename follow the "local" convention; for examp !. le, the path !. -- will be separated by '/' in *nix and by '\' in Windows. !. !. function Make_Begin_Line !. (Full_Filename : String; !. Group : String := "") !. return String; !, -- Create a "begin line" using the given filename (in the "l !. ocal" syntax) !. -- and optional group !. !, pragma Postcondition (Is_Begin_Line (Make_Begin_Line'Result)) !. ; !. !. function End_Line return String; !. -- Return the line used to mark the end of an embedded file !. !. procedure Parse_Content_Line !. (Item : String; !. Class : out Line_Class; !. Content : out Unbounded_String); !, -- Determine the class of the given line and return it in Cl !. ass. If !,1 -- the line is a content line (full or partial), return the !. "body" of !, -- the line in Content. The value of Content is unspecified !. if !. -- Class = No_content or End_Content !. !, type Line_Array is array (Positive range <>) of Unbounded_Str !.
!, function Make_Embedded_Lines (Line : String) return Line_Arra !. y; !, -- Take the content line and create one (or more, if necessa !. ry) lines !. -- with the correct marker at the beginning !. !. end RFC_Embedding.Syntax; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-escaping.adb" !. with Ada.Characters.Handling; !. !. with Ada.Strings.Fixed; !. use Ada.Strings.Fixed; !. !. with RFC_Embedding.Errors; !. use RFC_Embedding.Errors; !. !. package body RFC_Embedding.Syntax.Escaping is !. !. ------------ !. -- Escape -- !. ------------ !. !. function Escape (X : String; !. Breakers : Breaker_Array) return String is !. use Ada.Strings.Unbounded; !. !. function To_Hex (C : Character) return String is !. Hex : constant String := "0123456789abcdef"; !. N : constant Integer := Character'Pos (C); !. begin !. return Hex ((N / 16)+1) & Hex ((N mod 16)+1); !. end To_Hex; !. !. procedure Break_After (Pattern : String; !. Item : in out Unbounded_String) !. is !. Tmp : String := To_String (Item); !. Cursor : Natural; !. Next : Natural; !. begin !. Item := Null_Unbounded_String; !. !. Cursor := 1; !. !. loop !. Next := Index (Source => Tmp, !. Pattern => Pattern, !. From => Cursor); !. !. -- Stop when Pattern is absent or ends exactly at Tmp'Last !, exit when Next = 0 or Next = Tmp'Last - Pattern'Leng !. th + 1; !. !,1 Item := Item & Tmp (Cursor .. Next + Pattern'Length - 1) !. & "%%."; !. Cursor := Next + Pattern'Length; !. end loop; !. !. Item := Item & Tmp (Cursor .. Tmp'Last); !. end Break_After; !. !. Result : Unbounded_String; !. begin !. for I in X'Range loop !. case X (I) is !. when ' ' .. '$' | '&' .. '~' => !. Result := Result & X (I); !. !. when '%%' => !. Result := Result & "%%%%"; !. !. -- when '>' => !. -- Result := Result & "%%>"; !. !. !. when others => !. Result := Result & '%%' & To_Hex (X (I)); !. end case; !. end loop; !. !. for I in Breakers'Range loop !. Break_After (To_String (Breakers (I)), Result); !. end loop; !. !. return To_String (Result); !.
end Escape; Riccardo Expires January 13, 2014 [Page 41] Internet-Draft Embedding Code July 2013 !. !. function Escape (X : String; !. Breaker : String) return String is !. B : constant Breaker_Array (1 .. 1) := !. (1 => To_Unbounded_String (Breaker)); !. begin !. return Escape (X, B); !. end Escape; !. !. function Escape (X : String) return String is !. B : Breaker_Array (1 .. 0); !. begin !. return Escape (X, B); !. end Escape; !. !. -------------- !. -- Unescape -- !. -------------- !. !. function Unescape (X : String) return String !. is !. function To_Char (X : String) return Character; !. pragma Precondition (X'Length = 2); !. !. function To_Char (X : String) return Character is !. use Ada.Characters.Handling; !. begin !. if (not Is_Hexadecimal_Digit (X (X'First))) !, or (not Is_Hexadecimal_Digit (X (X'First + 1))) then !. !. raise Program_Error; !. end if; !. !, return Character'Val (Integer'Value ("16#" & X & "#")); !. !. end To_Char; !. !. Result : Unbounded_String; !. Cursor : Natural := X'First; !. begin !. !. loop !. exit when Cursor > X'Last; !. !. if X (Cursor) /= '%%' then !. Result := Result & X (Cursor); !. Cursor := Cursor + 1; !. Riccardo Expires January 13, 2014 [Page 42] Internet-Draft Embedding Code July 2013 !. else !. Cursor := Cursor + 1; !. !. if Cursor > X'Last then !. Die ("Lone '%%' at end of line"); !. end if; !. !. if X (Cursor) = '%%' then !. Result := Result & X (Cursor); !. Cursor := Cursor + 1; !. !. elsif X (Cursor) = '.' then !. Cursor := Cursor + 1; !. !. else !. if Cursor > X'Last - 1 then !. Die ("Wrong '%%' escape sequence"); !. end if; !. !, Result := Result & To_Char (X (Cursor .. Cursor + !. 1)); !. !. Cursor := Cursor + 2; !. end if; !. end if; !. end loop; !. !. return To_String (Result); !. end Unescape; !. !. !. end RFC_Embedding.Syntax.Escaping; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-escaping.ads" !. with Ada.Strings.Unbounded; !. use Ada.Strings.Unbounded; !. !. 
package RFC_Embedding.Syntax.Escaping is !. type Breaker_Array is !. array (Positive range <>) of Unbounded_String; !. !. function Escape (X : String; !. Breakers : Breaker_Array) return String; !, -- Replace non-printable characters and '%%' with the corres !. ponding !, -- escape sequence. After the replacement, if some entry of !. Breakers !,1 -- is present but not at the end of the escaped string, add !. after !, -- them the "null escape" sequence "%%.". This is useful to !. break !, -- specific character sequences that could be taken for "con !. trol" !,1 -- sequences in some context. For example, when escaping a !. string !, -- for XML embedding, one could want to have "]]%." in Break !. ers, so that if, !, -- by chance, the sequence "]]%.>" (that marks the end of CD !. ATA in XML) !, -- is in the input string, then it is converted to "]]%.%%.> !. ". !. !. function Escape (X : String; !. Breaker : String) return String; !, -- Equivalent to calling the function above with a one-eleme !. nt array !,1 -- Example: use Escape(X, "]]%.") to apply the XML escaping !. explained !. -- above. !. !. function Escape (X : String) return String; !. -- Equivalent to calling Escape with an empty array !. !. function Unescape (X : String) return String; !. -- Remove the escaping in X !. end RFC_Embedding.Syntax.Escaping; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-escaping-test.adb" !. with Ada.Text_IO; use Ada.Text_IO; !. !. procedure RFC_Embedding.Syntax.Escaping.Test is !. Source : String := !. "prova%%15" !. & Character'Val (5) !. & Character'Val (7) !. & "%%%%%%afgigi"; !. Escaped : String := Escape (Source); !. Unescaped : String := Unescape (Escaped); !. begin !. Put_Line (Escaped); !. !. if Unescaped = Source then !. Put_Line ("OK"); !. else !. Put_Line ("BAD"); !. end if; !.
end RFC_Embedding.Syntax.Escaping.Test; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-path_syntax.adb" !. with Ada.Strings.Maps; use Ada.Strings.Maps; !. with Ada.Strings.Fixed; use Ada, Ada.Strings; !. with Ada.Directories; !. with Ada.Text_IO; use Ada.Text_IO; !. !. with RFC_Embedding.Errors; use RFC_Embedding.Errors; !. !. !. !. package body RFC_Embedding.Syntax.Path_Syntax is !. !. Embedder_Separator : constant Character := '/'; !. !. --------------------- !. -- To_Local_Syntax -- !. --------------------- !. !. procedure To_Local_Syntax !. (Source : String; !. Path : out Unbounded_String; !. Filename : out Unbounded_String) !. is !, function Find_Last_Separator (X : String) return Natural i !. s !. begin !. return Fixed.Index (Source => X, !, Pattern => Embedder_Separator & "", !. !. From => X'Last, !. Going => Backward); !. end Find_Last_Separator; !. !. function To_Local_Path (Path : String) return String; !. pragma Precondition !. (Path (Path'First) /= Embedder_Separator and !. Path (Path'Last) /= Embedder_Separator); !. !. function To_Local_Path (Path : String) return String is !, -- Converts a "pure path" (i.e., without filename at t !. he end) !. -- to the local syntax. !. !. Split_Point : Natural := Find_Last_Separator (Path); !. begin !. if Split_Point = 0 then !. return Path; !. else !, pragma Assert (Split_Point > Path'First and Split_Po !. int < Path'Last); !. !. -- Yes, we are recursive. !. return Directories.Compose !, (Containing_Directory => To_Local_Path (Path (Pat !. h'First .. Split_Point - 1)), !, Name => Path (Split_Point + 1 .. !. Path'Last)); !. end if; !. end To_Local_Path; !. !. Split_Point : Natural; !. begin !. if Source (Source'First) = Embedder_Separator then !. declare !. Idx : Natural := Fixed.Index (Source => Source, !,1 Set => Maps.To_Set !. (Embedder_Separator), !, Test => Strings.Outs !. ide); !. begin !. if Idx = 0 then !.
Die ("Root dir not allowed"); !. else !, Warning ("Absolute path (" & Source & ") found. I !. nitial '/' ignored"); !. !, To_Local_Syntax (Source (Idx .. Source'Last), Pat !. h, Filename); !. return; !. end if; !. end; !. end if; !. !, pragma Assert (Source (Source'First) /= Embedder_Separator !. ); !. !. Split_Point := Find_Last_Separator (Source); !. Riccardo Expires January 13, 2014 [Page 46] Internet-Draft Embedding Code July 2013 !. pragma Assert (Split_Point /= Source'First); !. !. if Split_Point = 0 then !. -- No path separator found: we have a pure filename !. Filename := To_Unbounded_String (Source); !. Path := Null_Unbounded_String; !. return; !. end if; !. !. !. if Split_Point = Source'Last then !. Die ("Pure directories not allowed"); !. end if; !. !,1 pragma Assert (Split_Point > Source'First and Split_Point !. < Source'Last); !. !, Path := To_Unbounded_String (To_Local_Path (Source (Source !. 'First .. Split_Point - 1))); !. !, Filename := To_Unbounded_String (Source (Split_Point + 1 . !. . Source'Last)); !. end To_Local_Syntax; !. !. !. ------------------------ !. -- To_Embedded_Syntax -- !. ------------------------ !. !. function To_Embedded_Syntax !. (Filename : String) !. return String !. is !. use Ada.Directories; !. !. function Is_Full_Name (X : String) return Boolean is !. begin !. return X = Full_Name (X); !. end Is_Full_Name; !. !. function Is_Simple_Name (X : String) return Boolean is !. begin !. return X = Simple_Name (X); !. end Is_Simple_Name; !. !. procedure Check_Part (X : String) is !. use Ada.Strings.Fixed; !. begin Riccardo Expires January 13, 2014 [Page 47] Internet-Draft Embedding Code July 2013 !, if Index (Source => X, Pattern => Embedder_Separator & !. "") /= 0 then !, Die ("Filenames cannot contain '" & Embedder_Separat !. or & "'"); !. end if; !. end Check_Part; !. begin !. if Is_Full_Name (Filename) then !. Die ("Filename '" & Filename & "' is absolute"); !. end if; !. !. if Is_Simple_Name (Filename) then !. 
Check_Part (Filename); !. return Filename; !. else !. declare !, Dir : constant String := Containing_Directory (Fi !. lename); !. Simple : constant String := Simple_Name (Filename); !. begin !. Check_Part (Simple); !. !. return To_Embedded_Syntax (Dir) !. & Embedder_Separator !. & Simple; !. end; !. end if; !. end To_Embedded_Syntax; !. !. end RFC_Embedding.Syntax.Path_Syntax; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-path_syntax.ads" !. !. -- !, -- In the embedded format the file path is stored in a format i !. ndependent !, -- on the source/target system. The embedded syntax is just Un !. ix-like: !, -- directories in the path are separated by '/' and absolute pa !. th names !,1-- are not allowed. This package provides two procedures that !. map !, -- back and forth from the local syntax to the embedding syntax !. . !. -- !. Riccardo Expires January 13, 2014 [Page 48] Internet-Draft Embedding Code July 2013 !. package RFC_Embedding.Syntax.Path_Syntax is !. procedure To_Local_Syntax !. (Source : String; !. Path : out Unbounded_String; !. Filename : out Unbounded_String); !, -- Parse Source as an embedded filename and return the "path !. " part !, -- and the "filename" part. If the embedded filename is "si !. mple" (i.e., !. -- it has no '/'), Path is the empty string. !. !. function To_Embedded_Syntax !. (Filename : String) !. return String; !, -- Parse the full path in the local syntax and return the co !. rresponding !. -- path in the embedding syntax. !. end RFC_Embedding.Syntax.Path_Syntax; !--((END))-- !--((BGN))--Gada F"rfc_embedding-syntax-path_syntax-test.adb" !. with Ada.Strings.Unbounded; !. use Ada.Strings.Unbounded; !. !. with Ada.Directories; !. use Ada.Directories; !. !. with Ada.Text_IO; !. use Ada.Text_IO; !. !. procedure RFC_Embedding.Syntax.Path_Syntax.Test is !. Name : constant String := "pippo.c"; !. Source : String := Current_Directory; !. Embedded : String := To_Embedded_Syntax !, (Compose (Containing_Directory => Source (2 .. 
Source'Last !. ), !. Name => Name)); !. !. Path : Unbounded_String; !. Filename : Unbounded_String; !. begin !. To_Local_Syntax (Source => Embedded, !. Path => Path, !. Filename => Filename); !. !. Put_Line ("Source = '" & Source & "'"); !. Put_Line ("Embedded = '" & Embedded & "'"); Riccardo Expires January 13, 2014 [Page 49] Internet-Draft Embedding Code July 2013 !. Put_Line ("Path = '" & To_String (Path) & "'"); !. Put_Line ("File = '" & To_String (Filename) & "'"); !. !, if Source (2 .. Source'Last) = To_String (Path) and Name = To !. _String (Filename) then !. Put_Line ("OK"); !. else !. Put_Line ("BAD"); !. end if; !. end RFC_Embedding.Syntax.Path_Syntax.Test; !--((END))-- !--((BGN))--Gada F"README" !. =================== !. == What is this? == !. =================== !. !. This is the source code of the embedder/extractor. !. !. ====================== !. == How do I use it? == !. ====================== !. !.2You need an Ada compiler, for example, the GNAT compiler. !. !. The main files are !. !. * rfc_embedding-embedder.adb !. * rfc_embedding-extractor.adb !. !,1If you have the GNAT compiler, you can compile everything using !. the !. provided project file. !--((END))-- !--((BGN))--Gada F"extractor.gpr" !. project Extractor is !. !, for Main use ("rfc_embedding-embedder.adb", "rfc_embedding-ex !, tractor.adb", "rfc_embedding-syntax-escaping-test.adb", "rfc_emb !. edding-syntax-path_syntax-test.adb"); !. for Object_Dir use "bin/"; !. !. package Compiler is !, for Default_Switches ("ada") use ("-gnato", "-fstack-check !. ", "-gnata", "-gnat05"); Riccardo Expires January 13, 2014 [Page 50] Internet-Draft Embedding Code July 2013 !. end Compiler; !. !. end Extractor; !--((END))-- Author's Address Riccardo Bernardini (editor) University of Udine Via delle Scienze, 208 Udine 33100 Italy Phone: +39 0432 55 8271 Email: riccardo.bernardini@uniud.it Riccardo Expires January 13, 2014 [Page 51]