Compare commits

..

25 Commits

Author SHA1 Message Date
Toru Niina
f5079a7892 Merge pull request #32 from ToruNiina/allow-deeper-table-before
Allow deeper table before
2019-03-06 12:03:14 +09:00
ToruNiina
d90ffb63c6 Merge branch 'master' into allow-deeper-table-before 2019-03-05 23:27:11 +09:00
ToruNiina
b0ed122214 fix: allow deeper table appeared before
allow the following toml file.
```toml
[a.b.c]
d = 10
[a]
e = 2.718
```
2019-03-05 23:25:25 +09:00
ToruNiina
d88521d63c feat: enable to change region of value
To allow the following toml file, we need to replace the region after
the more precise region is found.
```toml
[a.b.c]
d = 42
[a]
e = 2.71
```
If the precise region (here, [a]) is found, the region of `a` should be
`[a]`, not `[a.b.c]`. After `[a]` is defined, toml does not allow to
write `[a]` twice. To check it, we need to replace the region of values
to the precise one.
2019-03-04 15:01:28 +09:00
ToruNiina
2accc9d22c fix: diagnose, but not throw for unicode error
in 2.0.x and 2.1.0, README says "it shows warning" for invalid unicode
codepoints. So far, this library just show an error message in stderr
for this case. It is not good to change the behavior fatal in the next
minor release, 2.1.1, that includes patches and improved error msgs.
I will make it throw syntax_error after 2.2.0 for invalid unicode
codepoints. For now, I will keep it to be "warning".
2019-03-03 18:56:45 +09:00
Toru Niina
363927f489 Merge pull request #31 from ToruNiina/err-msg-inhomogenous-array
improve error messages for invalid arrays
2019-03-02 23:07:29 +09:00
ToruNiina
ae793fb631 feat: improve error message for invalid array 2019-03-02 17:56:16 +09:00
ToruNiina
944b83642a feat: make location to inherit region_base
To generate error message, it is better to have the same interface.
Also, location can be considered as a region having only one character.
2019-03-02 17:52:00 +09:00
Toru Niina
5a8e5dee73 Merge pull request #30 from ToruNiina/hotfix
small patches for parser
2019-03-02 16:11:31 +09:00
ToruNiina
7f870d5861 fix: diagnose invalid UTF-8 codepoints 2019-03-02 01:57:05 +09:00
ToruNiina
536b23dc84 fix: allow empty table in the middle of a file 2019-03-01 22:53:16 +09:00
ToruNiina
0c9806e99f fix: diagnose key after [table.key] pattern
the following is not a valid toml format.
```
[table] key = "value"
```
this commit enables to diagnose that pattern.
2019-03-01 22:37:52 +09:00
ToruNiina
5a92932019 fix: disallow invalid escape sequence 2019-03-01 22:13:32 +09:00
ToruNiina
e929d2f00f fix: allow empty input file (to be an empty table) 2019-02-27 12:30:57 +09:00
Toru Niina
1e1e4c06e8 Merge pull request #29 from ToruNiina/err-msg-dotted-key
Error message for getting values corresponds to dotted keys
2019-02-27 12:21:03 +09:00
ToruNiina
d0726db473 chore: merge branch master into err-msg-dotted-key 2019-02-27 01:26:36 +09:00
Toru Niina
4cdc15a824 Merge pull request #28 from xaxousis/master
Add location to error string in parse_* functions
2019-02-27 01:25:52 +09:00
ToruNiina
b36fdf2f54 refactor: remove internal fn not needed any more
The function was needed to copy region information from value to value,
for a useful error message. Because of the last few commits, the region
information about keys are passed to insert_nested_keys that requires
the function which is removed. And it turned out that the function is no
longer required. It is originally a workaround, so I removed it.
2019-02-27 01:07:00 +09:00
ToruNiina
73ba6b385f feat: use key-region to represent table
use x.y.z = 42
    ~~~ here as the region of the table `x.y`
2019-02-27 01:02:39 +09:00
ToruNiina
30d1639aa4 refactor: return region info from parse_kvpair 2019-02-27 00:21:05 +09:00
Quentin Khan
d82814fc86 Add location to error string in parse_* functions 2019-02-26 14:56:58 +01:00
ToruNiina
2220efd682 chore: show appvayor status of master branch
other branches might be unstable, so they might fail. It is good to show
the status of the stable branch, rather than the experimental branches.
2019-02-26 00:26:04 +09:00
ToruNiina
679b365cf7 feat: get region info when parsing keys
Error messages related to dotted keys looks weird. like:
 1 | a.b.c = 42
   |         ~~ in this table
The underlined token is not a table. This should be like the following.
 1 | a.b.c = 42
   | ~~~ in this table
To implement this, the region information is needed when the keys are
read. This commit add this functionality, though currently the region
information is not used yet.
2019-02-26 00:17:28 +09:00
ToruNiina
83bf83b6dd style: add braces to if and remove additional else 2019-02-19 02:56:15 +09:00
ToruNiina
321364c7c2 fix: format char in an error message correctly 2019-02-19 02:46:48 +09:00
8 changed files with 450 additions and 215 deletions

View File

@@ -2,7 +2,7 @@ toml11
======
[![Build Status](https://travis-ci.org/ToruNiina/toml11.svg?branch=master)](https://travis-ci.org/ToruNiina/toml11)
[![Build status](https://ci.appveyor.com/api/projects/status/m2n08a926asvg5mg?svg=true)](https://ci.appveyor.com/project/ToruNiina/toml11)
[![Build status](https://ci.appveyor.com/api/projects/status/m2n08a926asvg5mg/branch/master?svg=true)](https://ci.appveyor.com/project/ToruNiina/toml11/branch/master)
[![Version](https://img.shields.io/github/release/ToruNiina/toml11.svg?style=flat)](https://github.com/ToruNiina/toml11/releases)
[![License](https://img.shields.io/github/license/ToruNiina/toml11.svg?style=flat)](LICENSE)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1209136.svg)](https://doi.org/10.5281/zenodo.1209136)

View File

@@ -13,23 +13,23 @@ using namespace detail;
BOOST_AUTO_TEST_CASE(test_bare_key)
{
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "barekey", std::vector<key>(1, "barekey"));
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "bare-key", std::vector<key>(1, "bare-key"));
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "bare_key", std::vector<key>(1, "bare_key"));
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "1234", std::vector<key>(1, "1234"));
TOML11_TEST_PARSE_EQUAL(parse_key, "barekey", std::vector<key>(1, "barekey"));
TOML11_TEST_PARSE_EQUAL(parse_key, "bare-key", std::vector<key>(1, "bare-key"));
TOML11_TEST_PARSE_EQUAL(parse_key, "bare_key", std::vector<key>(1, "bare_key"));
TOML11_TEST_PARSE_EQUAL(parse_key, "1234", std::vector<key>(1, "1234"));
}
BOOST_AUTO_TEST_CASE(test_quoted_key)
{
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "\"127.0.0.1\"", std::vector<key>(1, "127.0.0.1" ));
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "\"character encoding\"", std::vector<key>(1, "character encoding"));
TOML11_TEST_PARSE_EQUAL(parse_key, "\"127.0.0.1\"", std::vector<key>(1, "127.0.0.1" ));
TOML11_TEST_PARSE_EQUAL(parse_key, "\"character encoding\"", std::vector<key>(1, "character encoding"));
#if defined(_MSC_VER) || defined(__INTEL_COMPILER)
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "\"\xCA\x8E\xC7\x9D\xCA\x9E\"", std::vector<key>(1, "\xCA\x8E\xC7\x9D\xCA\x9E"));
TOML11_TEST_PARSE_EQUAL(parse_key, "\"\xCA\x8E\xC7\x9D\xCA\x9E\"", std::vector<key>(1, "\xCA\x8E\xC7\x9D\xCA\x9E"));
#else
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "\"ʎǝʞ\"", std::vector<key>(1, "ʎǝʞ" ));
TOML11_TEST_PARSE_EQUAL(parse_key, "\"ʎǝʞ\"", std::vector<key>(1, "ʎǝʞ" ));
#endif
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "'key2'", std::vector<key>(1, "key2" ));
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "'quoted \"value\"'", std::vector<key>(1, "quoted \"value\"" ));
TOML11_TEST_PARSE_EQUAL(parse_key, "'key2'", std::vector<key>(1, "key2" ));
TOML11_TEST_PARSE_EQUAL(parse_key, "'quoted \"value\"'", std::vector<key>(1, "quoted \"value\"" ));
}
BOOST_AUTO_TEST_CASE(test_dotted_key)
@@ -38,13 +38,13 @@ BOOST_AUTO_TEST_CASE(test_dotted_key)
std::vector<key> keys(2);
keys[0] = "physical";
keys[1] = "color";
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "physical.color", keys);
TOML11_TEST_PARSE_EQUAL(parse_key, "physical.color", keys);
}
{
std::vector<key> keys(2);
keys[0] = "physical";
keys[1] = "shape";
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "physical.shape", keys);
TOML11_TEST_PARSE_EQUAL(parse_key, "physical.shape", keys);
}
{
std::vector<key> keys(4);
@@ -52,12 +52,12 @@ BOOST_AUTO_TEST_CASE(test_dotted_key)
keys[1] = "y";
keys[2] = "z";
keys[3] = "w";
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "x.y.z.w", keys);
TOML11_TEST_PARSE_EQUAL(parse_key, "x.y.z.w", keys);
}
{
std::vector<key> keys(2);
keys[0] = "site";
keys[1] = "google.com";
TOML11_TEST_PARSE_EQUAL_VALUE(parse_key, "site.\"google.com\"", keys);
TOML11_TEST_PARSE_EQUAL(parse_key, "site.\"google.com\"", keys);
}
}

View File

@@ -39,7 +39,7 @@ inline std::string show_char(const char c)
else
{
std::ostringstream oss;
oss << std::hex << std::setfill('0') << std::setw(2) << "0x"
oss << "0x" << std::hex << std::setfill('0') << std::setw(2)
<< static_cast<int>(c);
return oss.str();
}

View File

@@ -124,9 +124,9 @@ using lex_escape_unicode_short = sequence<character<'u'>,
using lex_escape_unicode_long = sequence<character<'U'>,
repeat<lex_hex_dig, exactly<8>>>;
using lex_escape_seq_char = either<character<'"'>, character<'\\'>,
character<'/'>, character<'b'>,
character<'f'>, character<'n'>,
character<'r'>, character<'t'>,
character<'b'>, character<'f'>,
character<'n'>, character<'r'>,
character<'t'>,
lex_escape_unicode_short,
lex_escape_unicode_long
>;

View File

@@ -34,7 +34,7 @@ parse_boolean(location<Container>& loc)
}
}
loc.iter() = first; //rollback
return err(std::string("[error] toml::parse_boolean: "
return err(format_underline("[error] toml::parse_boolean: ", loc,
"the next token is not a boolean"));
}
@@ -63,7 +63,7 @@ parse_binary_integer(location<Container>& loc)
return ok(std::make_pair(retval, token.unwrap()));
}
loc.iter() = first;
return err(std::string("[error] toml::parse_binary_integer:"
return err(format_underline("[error] toml::parse_binary_integer:", loc,
"the next token is not an integer"));
}
@@ -84,7 +84,7 @@ parse_octal_integer(location<Container>& loc)
return ok(std::make_pair(retval, token.unwrap()));
}
loc.iter() = first;
return err(std::string("[error] toml::parse_octal_integer:"
return err(format_underline("[error] toml::parse_octal_integer:", loc,
"the next token is not an integer"));
}
@@ -105,7 +105,7 @@ parse_hexadecimal_integer(location<Container>& loc)
return ok(std::make_pair(retval, token.unwrap()));
}
loc.iter() = first;
return err(std::string("[error] toml::parse_hexadecimal_integer"
return err(format_underline("[error] toml::parse_hexadecimal_integer", loc,
"the next token is not an integer"));
}
@@ -133,7 +133,7 @@ parse_integer(location<Container>& loc)
return ok(std::make_pair(retval, token.unwrap()));
}
loc.iter() = first;
return err(std::string("[error] toml::parse_integer: "
return err(format_underline("[error] toml::parse_integer: ", loc,
"the next token is not an integer"));
}
@@ -222,12 +222,13 @@ parse_floating(location<Container>& loc)
return ok(std::make_pair(v, token.unwrap()));
}
loc.iter() = first;
return err(std::string("[error] toml::parse_floating: "
return err(format_underline("[error] toml::parse_floating: ", loc,
"the next token is not a float"));
}
template<typename Container>
std::string read_utf8_codepoint(const region<Container>& reg)
template<typename Container, typename Container2>
std::string read_utf8_codepoint(const region<Container>& reg,
/* for err msg */ const location<Container2>& loc)
{
const auto str = reg.str().substr(1);
std::uint_least32_t codepoint;
@@ -247,20 +248,27 @@ std::string read_utf8_codepoint(const region<Container>& reg)
}
else if(codepoint < 0x10000) // U+0800...U+FFFF
{
if(0xD800 <= codepoint && codepoint <= 0xDFFF)
{
std::cerr << format_underline("[warning] "
"toml::read_utf8_codepoint: codepoints in the range "
"[0xD800, 0xDFFF] are not valid UTF-8.",
loc, "not a valid UTF-8 codepoint") << std::endl;
}
assert(codepoint < 0xD800 || 0xDFFF < codepoint);
// 1110yyyy 10yxxxxx 10xxxxxx
character += static_cast<unsigned char>(0xE0| codepoint >> 12);
character += static_cast<unsigned char>(0x80|(codepoint >> 6 & 0x3F));
character += static_cast<unsigned char>(0x80|(codepoint & 0x3F));
}
else if(codepoint < 0x200000) // U+10000 ... U+1FFFFF
else if(codepoint < 0x200000) // U+010000 ... U+1FFFFF
{
if(0x10FFFF < codepoint) // out of Unicode region
{
std::cerr << format_underline(concat_to_string("[warning] "
"input codepoint (", str, ") is too large to decode as "
"a unicode character. The result may not be able to render "
"to your screen."), reg, "should be in [0x00..0x10FFFF]")
<< std::endl;
std::cerr << format_underline("[error] "
"toml::read_utf8_codepoint: input codepoint is too large to "
"decode as a unicode character.", loc,
"should be in [0x00..0x10FFFF]") << std::endl;
}
// 11110yyy 10yyxxxx 10xxxxxx 10xxxxxx
character += static_cast<unsigned char>(0xF0| codepoint >> 18);
@@ -283,7 +291,7 @@ result<std::string, std::string> parse_escape_sequence(location<Container>& loc)
const auto first = loc.iter();
if(first == loc.end() || *first != '\\')
{
return err(std::string("[error]: toml::parse_escape_sequence: "
return err(format_underline("[error]: toml::parse_escape_sequence: ", loc,
"the next token is not an escape sequence \"\\\""));
}
++loc.iter();
@@ -300,7 +308,7 @@ result<std::string, std::string> parse_escape_sequence(location<Container>& loc)
{
if(const auto token = lex_escape_unicode_short::invoke(loc))
{
return ok(read_utf8_codepoint(token.unwrap()));
return ok(read_utf8_codepoint(token.unwrap(), loc));
}
else
{
@@ -313,7 +321,7 @@ result<std::string, std::string> parse_escape_sequence(location<Container>& loc)
{
if(const auto token = lex_escape_unicode_long::invoke(loc))
{
return ok(read_utf8_codepoint(token.unwrap()));
return ok(read_utf8_codepoint(token.unwrap(), loc));
}
else
{
@@ -523,7 +531,7 @@ parse_string(location<Container>& loc)
if(const auto rslt = parse_ml_literal_string(loc)) {return rslt;}
if(const auto rslt = parse_basic_string(loc)) {return rslt;}
if(const auto rslt = parse_literal_string(loc)) {return rslt;}
return err(std::string("[error] toml::parse_string: "
return err(format_underline("[error] toml::parse_string: ", loc,
"the next token is not a string"));
}
@@ -573,7 +581,7 @@ parse_local_date(location<Container>& loc)
else
{
loc.iter() = first;
return err(std::string("[error]: toml::parse_local_date: "
return err(format_underline("[error]: toml::parse_local_date: ", loc,
"the next token is not a local_date"));
}
}
@@ -656,7 +664,7 @@ parse_local_time(location<Container>& loc)
else
{
loc.iter() = first;
return err(std::string("[error]: toml::parse_local_time: "
return err(format_underline("[error]: toml::parse_local_time: ", loc,
"the next token is not a local_time"));
}
}
@@ -699,7 +707,7 @@ parse_local_datetime(location<Container>& loc)
else
{
loc.iter() = first;
return err(std::string("[error]: toml::parse_local_datetime: "
return err(format_underline("[error]: toml::parse_local_datetime: ", loc,
"the next token is not a local_datetime"));
}
}
@@ -748,39 +756,43 @@ parse_offset_datetime(location<Container>& loc)
else
{
loc.iter() = first;
return err(std::string("[error]: toml::parse_offset_datetime: "
return err(format_underline("[error]: toml::parse_offset_datetime: ", loc,
"the next token is not a local_datetime"));
}
}
template<typename Container>
result<key, std::string> parse_simple_key(location<Container>& loc)
result<std::pair<key, region<Container>>, std::string>
parse_simple_key(location<Container>& loc)
{
if(const auto bstr = parse_basic_string(loc))
{
return ok(bstr.unwrap().first.str);
return ok(std::make_pair(bstr.unwrap().first.str, bstr.unwrap().second));
}
if(const auto lstr = parse_literal_string(loc))
{
return ok(lstr.unwrap().first.str);
return ok(std::make_pair(lstr.unwrap().first.str, lstr.unwrap().second));
}
if(const auto bare = lex_unquoted_key::invoke(loc))
{
return ok(bare.unwrap().str());
const auto reg = bare.unwrap();
return ok(std::make_pair(reg.str(), reg));
}
return err(std::string("[error] toml::parse_simple_key: "
return err(format_underline("[error] toml::parse_simple_key: ", loc,
"the next token is not a simple key"));
}
// dotted key become vector of keys
template<typename Container>
result<std::vector<key>, std::string> parse_key(location<Container>& loc)
result<std::pair<std::vector<key>, region<Container>>, std::string>
parse_key(location<Container>& loc)
{
const auto first = loc.iter();
// dotted key -> foo.bar.baz whitespaces are allowed
if(const auto token = lex_dotted_key::invoke(loc))
{
location<std::string> inner_loc(loc.name(), token.unwrap().str());
const auto reg = token.unwrap();
location<std::string> inner_loc(loc.name(), reg.str());
std::vector<key> keys;
while(inner_loc.iter() != inner_loc.end())
@@ -788,7 +800,7 @@ result<std::vector<key>, std::string> parse_key(location<Container>& loc)
lex_ws::invoke(inner_loc);
if(const auto k = parse_simple_key(inner_loc))
{
keys.push_back(k.unwrap());
keys.push_back(k.unwrap().first);
}
else
{
@@ -813,17 +825,17 @@ result<std::vector<key>, std::string> parse_key(location<Container>& loc)
"should be `.`"));
}
}
return ok(keys);
return ok(std::make_pair(keys, reg));
}
loc.iter() = first;
// simple key -> foo
if(const auto smpl = parse_simple_key(loc))
{
return ok(std::vector<key>(1, smpl.unwrap()));
return ok(std::make_pair(std::vector<key>(1, smpl.unwrap().first),
smpl.unwrap().second));
}
return err(std::string("[error] toml::parse_key: "
"the next token is not a key"));
return err(format_underline("[error] toml::parse_key: ", loc, "is not a valid key"));
}
// forward-decl to implement parse_array and parse_table
@@ -864,16 +876,39 @@ parse_array(location<Container>& loc)
{
if(!retval.empty() && retval.front().type() != val.as_ok().type())
{
throw syntax_error(format_underline(
"[error] toml::parse_array: type of elements should be the "
"same each other.", region<Container>(loc, first, loc.iter()),
"inhomogeneous types"));
auto array_start_loc = loc;
array_start_loc.iter() = first;
throw syntax_error(format_underline("[error] toml::parse_array: "
"type of elements should be the same each other.",
std::vector<std::pair<region_base const*, std::string>>{
std::make_pair(
std::addressof(array_start_loc),
std::string("array starts here")
),
std::make_pair(
std::addressof(get_region(retval.front())),
std::string("value has type ") +
stringize(retval.front().type())
),
std::make_pair(
std::addressof(get_region(val.unwrap())),
std::string("value has different type, ") +
stringize(val.unwrap().type())
)
}));
}
retval.push_back(std::move(val.unwrap()));
}
else
{
return err(val.unwrap_err());
auto array_start_loc = loc;
array_start_loc.iter() = first;
throw syntax_error(format_underline("[error] toml::parse_array: "
"value having invalid format appeared in an array",
array_start_loc, "array starts here",
loc, "it is not a valid value."));
}
using lex_array_separator = sequence<maybe<lex_ws>, character<','>>;
@@ -889,8 +924,12 @@ parse_array(location<Container>& loc)
}
else
{
auto array_start_loc = loc;
array_start_loc.iter() = first;
throw syntax_error(format_underline("[error] toml::parse_array:"
" missing array separator `,`", loc, "should be `,`"));
" missing array separator `,` after a value",
array_start_loc, "array starts here", loc, "should be `,`"));
}
}
}
@@ -900,14 +939,14 @@ parse_array(location<Container>& loc)
}
template<typename Container>
result<std::pair<std::vector<key>, value>, std::string>
result<std::pair<std::pair<std::vector<key>, region<Container>>, value>, std::string>
parse_key_value_pair(location<Container>& loc)
{
const auto first = loc.iter();
auto key = parse_key(loc);
if(!key)
auto key_reg = parse_key(loc);
if(!key_reg)
{
std::string msg = std::move(key.unwrap_err());
std::string msg = std::move(key_reg.unwrap_err());
// if the next token is keyvalue-separator, it means that there are no
// key. then we need to show error as "empty key is not allowed".
if(const auto keyval_sep = lex_keyval_sep::invoke(loc))
@@ -948,6 +987,7 @@ parse_key_value_pair(location<Container>& loc)
{
std::string msg;
loc.iter() = after_kvsp;
// check there is something not a comment/whitespace after `=`
if(sequence<maybe<lex_ws>, maybe<lex_comment>, lex_newline>::invoke(loc))
{
loc.iter() = after_kvsp;
@@ -955,15 +995,15 @@ parse_key_value_pair(location<Container>& loc)
"missing value after key-value separator '='", loc,
"expected value, but got nothing");
}
else
else // there is something not a comment/whitespace, so invalid format.
{
msg = format_underline("[error] toml::parse_key_value_pair: "
"invalid value format", loc, val.unwrap_err());
msg = std::move(val.unwrap_err());
}
loc.iter() = first;
return err(msg);
}
return ok(std::make_pair(std::move(key.unwrap()), std::move(val.unwrap())));
return ok(std::make_pair(std::move(key_reg.unwrap()),
std::move(val.unwrap())));
}
// for error messages.
@@ -982,10 +1022,80 @@ std::string format_dotted_keys(InputIterator first, const InputIterator last)
return retval;
}
template<typename InputIterator>
// forward decl for is_valid_forward_table_definition
template<typename Container>
result<std::pair<std::vector<key>, region<Container>>, std::string>
parse_table_key(location<Container>& loc);
// The following toml file is allowed.
// ```toml
// [a.b.c] # here, table `a` has element `b`.
// foo = "bar"
// [a] # merge a = {baz = "qux"} to a = {b = {...}}
// baz = "qux"
// ```
// But the following is not allowed.
// ```toml
// [a]
// b.c.foo = "bar"
// [a] # error! the same table [a] defined!
// baz = "qux"
// ```
// The following is neither allowed.
// ```toml
// a = { b.c.foo = "bar"}
// [a] # error! the same table [a] defined!
// baz = "qux"
// ```
// Here, it parses region of `tab->at(k)` as a table key and check the depth
// of the key. If the key region points deeper node, it would be allowed.
// Otherwise, the key points the same node. It would be rejected.
template<typename Iterator>
bool is_valid_forward_table_definition(const value& fwd,
Iterator key_first, Iterator key_curr, Iterator key_last)
{
location<std::string> def("internal", detail::get_region(fwd).str());
if(const auto tabkeys = parse_table_key(def))
{
// table keys always contains all the nodes from the root.
const auto& tks = tabkeys.unwrap().first;
if(std::distance(key_first, key_last) == tks.size() &&
std::equal(tks.begin(), tks.end(), key_first))
{
// the keys are equivalent. it is not allowed.
return false;
}
// the keys are not equivalent. it is allowed.
return true;
}
if(const auto dotkeys = parse_key(def))
{
// consider the following case.
// [a]
// b.c = {d = 42}
// [a.b.c]
// e = 2.71
// this defines the table [a.b.c] twice. no?
// a dotted key starts from the node representing a table in which the
// dotted key belongs to.
const auto& dks = dotkeys.unwrap().first;
if(std::distance(key_curr, key_last) == dks.size() &&
std::equal(dks.begin(), dks.end(), key_curr))
{
// the keys are equivalent. it is not allowed.
return false;
}
// the keys are not equivalent. it is allowed.
return true;
}
return false;
}
template<typename InputIterator, typename Container>
result<bool, std::string>
insert_nested_key(table& root, const toml::value& v,
InputIterator iter, const InputIterator last,
region<Container> key_reg,
const bool is_array_of_table = false)
{
static_assert(std::is_same<key,
@@ -1067,24 +1177,38 @@ insert_nested_key(table& root, const toml::value& v,
}
else // if not, we need to create the array of table
{
toml::value aot(v); // copy region info from table to array
// update content by an array with one element
detail::assign_keeping_region(aot,
::toml::value(toml::array(1, v)));
toml::value aot(toml::array(1, v), key_reg);
tab->insert(std::make_pair(k, aot));
return ok(true);
}
}
} // end if(array of table)
if(tab->count(k) == 1)
{
if(tab->at(k).is(value_t::Table) && v.is(value_t::Table))
{
throw syntax_error(format_underline(concat_to_string(
"[error] toml::insert_value: table (\"",
format_dotted_keys(first, last), "\") already exists."),
get_region(tab->at(k)), "table already exists here",
get_region(v), "table defined twice"));
if(!is_valid_forward_table_definition(
tab->at(k), first, iter, last))
{
throw syntax_error(format_underline(concat_to_string(
"[error] toml::insert_value: table (\"",
format_dotted_keys(first, last),
"\") already exists."),
get_region(tab->at(k)), "table already exists here",
get_region(v), "table defined twice"));
}
// to allow the following toml file.
// [a.b.c]
// d = 42
// [a]
// e = 2.71
auto& t = tab->at(k).cast<value_t::Table>();
for(const auto& kv : v.cast<value_t::Table>())
{
t[kv.first] = kv.second;
}
detail::change_region(tab->at(k), key_reg);
return ok(true);
}
else if(v.is(value_t::Table) &&
tab->at(k).is(value_t::Array) &&
@@ -1104,7 +1228,7 @@ insert_nested_key(table& root, const toml::value& v,
"[error] toml::insert_value: value (\"",
format_dotted_keys(first, last), "\") already exists."),
get_region(tab->at(k)), "value already exists here",
get_region(v), "value inserted twice"));
get_region(v), "value defined twice"));
}
}
tab->insert(std::make_pair(k, v));
@@ -1120,10 +1244,7 @@ insert_nested_key(table& root, const toml::value& v,
// [x.y.z]
if(tab->count(k) == 0)
{
// the region of [x.y] is the same as [x.y.z].
(*tab)[k] = v; // copy region_info_
detail::assign_keeping_region((*tab)[k],
::toml::value(::toml::table{}));
(*tab)[k] = toml::value(toml::table{}, key_reg);
}
// type checking...
@@ -1167,7 +1288,7 @@ parse_inline_table(location<Container>& loc)
table retval;
if(!(loc.iter() != loc.end() && *loc.iter() == '{'))
{
return err(std::string("[error] toml::parse_inline_table: "
return err(format_underline("[error] toml::parse_inline_table: ", loc,
"the next token is not an inline table"));
}
++loc.iter();
@@ -1187,11 +1308,12 @@ parse_inline_table(location<Container>& loc)
{
return err(kv_r.unwrap_err());
}
const std::vector<key>& keys = kv_r.unwrap().first;
const value& val = kv_r.unwrap().second;
const std::vector<key>& keys = kv_r.unwrap().first.first;
const region<Container>& key_reg = kv_r.unwrap().first.second;
const value& val = kv_r.unwrap().second;
const auto inserted =
insert_nested_key(retval, val, keys.begin(), keys.end());
insert_nested_key(retval, val, keys.begin(), keys.end(), key_reg);
if(!inserted)
{
throw internal_error("[error] toml::parse_inline_table: "
@@ -1228,7 +1350,7 @@ result<value, std::string> parse_value(location<Container>& loc)
const auto first = loc.iter();
if(first == loc.end())
{
return err(std::string("[error] toml::parse_value: input is empty"));
return err(format_underline("[error] toml::parse_value: input is empty", loc, ""));
}
if(auto r = parse_string (loc))
{return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
@@ -1289,7 +1411,21 @@ parse_table_key(location<Container>& loc)
throw internal_error(format_underline("[error] "
"toml::parse_table_key: no `]`", inner_loc, "should be `]`"));
}
return ok(std::make_pair(keys.unwrap(), token.unwrap()));
// after [table.key], newline or EOF(empty table) requried.
if(loc.iter() != loc.end())
{
using lex_newline_after_table_key =
sequence<maybe<lex_ws>, maybe<lex_comment>, lex_newline>;
const auto nl = lex_newline_after_table_key::invoke(loc);
if(!nl)
{
throw syntax_error(format_underline("[error] "
"toml::parse_table_key: newline required after [table.key]",
loc, "expected newline"));
}
}
return ok(std::make_pair(keys.unwrap().first, token.unwrap()));
}
else
{
@@ -1327,7 +1463,21 @@ parse_array_table_key(location<Container>& loc)
throw internal_error(format_underline("[error] "
"toml::parse_table_key: no `]]`", inner_loc, "should be `]]`"));
}
return ok(std::make_pair(keys.unwrap(), token.unwrap()));
// after [[table.key]], newline or EOF(empty table) requried.
if(loc.iter() != loc.end())
{
using lex_newline_after_table_key =
sequence<maybe<lex_ws>, maybe<lex_comment>, lex_newline>;
const auto nl = lex_newline_after_table_key::invoke(loc);
if(!nl)
{
throw syntax_error(format_underline("[error] "
"toml::parse_array_table_key: newline required after "
"[[table.key]]", loc, "expected newline"));
}
}
return ok(std::make_pair(keys.unwrap().first, token.unwrap()));
}
else
{
@@ -1342,7 +1492,7 @@ result<table, std::string> parse_ml_table(location<Container>& loc)
const auto first = loc.iter();
if(first == loc.end())
{
return err(std::string("toml::parse_ml_table: input is empty"));
return ok(toml::table{});
}
// XXX at lest one newline is needed.
@@ -1368,10 +1518,11 @@ result<table, std::string> parse_ml_table(location<Container>& loc)
if(const auto kv = parse_key_value_pair(loc))
{
const std::vector<key>& keys = kv.unwrap().first;
const value& val = kv.unwrap().second;
const std::vector<key>& keys = kv.unwrap().first.first;
const region<Container>& key_reg = kv.unwrap().first.second;
const value& val = kv.unwrap().second;
const auto inserted =
insert_nested_key(tab, val, keys.begin(), keys.end());
insert_nested_key(tab, val, keys.begin(), keys.end(), key_reg);
if(!inserted)
{
return err(inserted.unwrap_err());
@@ -1420,11 +1571,11 @@ result<table, std::string> parse_toml_file(location<Container>& loc)
const auto first = loc.iter();
if(first == loc.end())
{
return err(std::string("toml::detail::parse_toml_file: input is empty"));
return ok(toml::table{});
}
table data;
/* root object is also table, but without [tablename] */
// root object is also a table, but without [tablename]
if(auto tab = parse_ml_table(loc))
{
data = std::move(tab.unwrap());
@@ -1449,7 +1600,8 @@ result<table, std::string> parse_toml_file(location<Container>& loc)
const auto inserted = insert_nested_key(data,
toml::value(tab.unwrap(), reg),
keys.begin(), keys.end(), /*is_array_of_table=*/ true);
keys.begin(), keys.end(), reg,
/*is_array_of_table=*/ true);
if(!inserted) {return err(inserted.unwrap_err());}
continue;
@@ -1463,7 +1615,7 @@ result<table, std::string> parse_toml_file(location<Container>& loc)
const auto& reg = tabkey.unwrap().second;
const auto inserted = insert_nested_key(data,
toml::value(tab.unwrap(), reg), keys.begin(), keys.end());
toml::value(tab.unwrap(), reg), keys.begin(), keys.end(), reg);
if(!inserted) {return err(inserted.unwrap_err());}
continue;

View File

@@ -28,44 +28,6 @@ inline std::string make_string(std::size_t len, char c)
return std::string(len, c);
}
// location in a container, normally in a file content.
// shared_ptr points the resource that the iter points.
// it can be used not only for resource handling, but also error message.
template<typename Container>
struct location
{
static_assert(std::is_same<char, typename Container::value_type>::value,"");
using const_iterator = typename Container::const_iterator;
using source_ptr = std::shared_ptr<const Container>;
location(std::string name, Container cont)
: source_(std::make_shared<Container>(std::move(cont))),
source_name_(std::move(name)), iter_(source_->cbegin())
{}
location(const location&) = default;
location(location&&) = default;
location& operator=(const location&) = default;
location& operator=(location&&) = default;
~location() = default;
const_iterator& iter() noexcept {return iter_;}
const_iterator iter() const noexcept {return iter_;}
const_iterator begin() const noexcept {return source_->cbegin();}
const_iterator end() const noexcept {return source_->cend();}
source_ptr const& source() const& noexcept {return source_;}
source_ptr&& source() && noexcept {return std::move(source_);}
std::string const& name() const noexcept {return source_name_;}
private:
source_ptr source_;
std::string source_name_;
const_iterator iter_;
};
// region in a container, normally in a file content.
// shared_ptr points the resource that the iter points.
// combinators returns this.
@@ -86,12 +48,89 @@ struct region_base
virtual std::string line() const {return std::string("unknown line");}
virtual std::string line_num() const {return std::string("?");}
virtual std::size_t before() const noexcept {return 0;}
virtual std::size_t size() const noexcept {return 0;}
virtual std::size_t after() const noexcept {return 0;}
};
// location in a container, normally in a file content.
// shared_ptr points the resource that the iter points.
// it can be used not only for resource handling, but also error message.
//
// it can be considered as a region that contains only one character.
template<typename Container>
struct location final : public region_base
{
static_assert(std::is_same<char, typename Container::value_type>::value,"");
using const_iterator = typename Container::const_iterator;
using source_ptr = std::shared_ptr<const Container>;
location(std::string name, Container cont)
: source_(std::make_shared<Container>(std::move(cont))),
source_name_(std::move(name)), iter_(source_->cbegin())
{}
location(const location&) = default;
location(location&&) = default;
location& operator=(const location&) = default;
location& operator=(location&&) = default;
~location() = default;
bool is_ok() const noexcept override {return static_cast<bool>(source_);}
const_iterator& iter() noexcept {return iter_;}
const_iterator iter() const noexcept {return iter_;}
const_iterator begin() const noexcept {return source_->cbegin();}
const_iterator end() const noexcept {return source_->cend();}
std::string str() const override {return make_string(1, *this->iter());}
std::string name() const override {return source_name_;}
std::string line_num() const override
{
return std::to_string(1+std::count(this->begin(), this->iter(), '\n'));
}
std::string line() const override
{
return make_string(this->line_begin(), this->line_end());
}
const_iterator line_begin() const noexcept
{
using reverse_iterator = std::reverse_iterator<const_iterator>;
return std::find(reverse_iterator(this->iter()),
reverse_iterator(this->begin()), '\n').base();
}
const_iterator line_end() const noexcept
{
return std::find(this->iter(), this->end(), '\n');
}
// location is always points a character. so the size is 1.
std::size_t size() const noexcept override
{
return 1u;
}
std::size_t before() const noexcept override
{
return std::distance(this->line_begin(), this->iter());
}
std::size_t after() const noexcept override
{
return std::distance(this->iter(), this->line_end());
}
source_ptr const& source() const& noexcept {return source_;}
source_ptr&& source() && noexcept {return std::move(source_);}
private:
source_ptr source_;
std::string source_name_;
const_iterator iter_;
};
template<typename Container>
struct region final : public region_base
{
@@ -225,7 +264,19 @@ inline std::string format_underline(const std::string& message,
retval += make_string(line_number.size() + 1, ' ');
retval += " | ";
retval += make_string(reg.before(), ' ');
retval += make_string(reg.size(), '~');
if(reg.size() == 1)
{
// invalid
// ^------
retval += '^';
retval += make_string(reg.after(), '-');
}
else
{
// invalid
// ~~~~~~~
retval += make_string(reg.size(), '~');
}
retval += ' ';
retval += comment_for_underline;
if(helps.size() != 0)
@@ -270,7 +321,19 @@ inline std::string format_underline(const std::string& message,
retval << make_string(line_num_width + 1, ' ');
retval << " | ";
retval << make_string(reg1.before(), ' ');
retval << make_string(reg1.size(), '~');
if(reg1.size() == 1)
{
// invalid
// ^------
retval << '^';
retval << make_string(reg1.after(), '-');
}
else
{
// invalid
// ~~~~~~~
retval << make_string(reg1.size(), '~');
}
retval << ' ';
retval << comment_for_underline1 << newline;
// ---------------------------------------
@@ -287,7 +350,19 @@ inline std::string format_underline(const std::string& message,
retval << make_string(line_num_width + 1, ' ');
retval << " | ";
retval << make_string(reg2.before(), ' ');
retval << make_string(reg2.size(), '~');
if(reg2.size() == 1)
{
// invalid
// ^------
retval << '^';
retval << make_string(reg2.after(), '-');
}
else
{
// invalid
// ~~~~~~~
retval << make_string(reg2.size(), '~');
}
retval << ' ';
retval << comment_for_underline2;
if(helps.size() != 0)
@@ -305,62 +380,84 @@ inline std::string format_underline(const std::string& message,
return retval.str();
}
// to show a better error message.
template<typename Container>
std::string
format_underline(const std::string& message, const location<Container>& loc,
const std::string& comment_for_underline,
std::vector<std::string> helps = {})
inline std::string format_underline(const std::string& message,
std::vector<std::pair<region_base const*, std::string>> reg_com,
std::vector<std::string> helps = {})
{
assert(!reg_com.empty());
#ifdef _WIN32
const auto newline = "\r\n";
#else
const char newline = '\n';
#endif
using const_iterator = typename location<Container>::const_iterator;
using reverse_iterator = std::reverse_iterator<const_iterator>;
const auto line_begin = std::find(reverse_iterator(loc.iter()),
reverse_iterator(loc.begin()),
'\n').base();
const auto line_end = std::find(loc.iter(), loc.end(), '\n');
const auto line_number = std::to_string(
1 + std::count(loc.begin(), loc.iter(), '\n'));
const auto line_num_width = std::max_element(reg_com.begin(), reg_com.end(),
[](std::pair<region_base const*, std::string> const& lhs,
std::pair<region_base const*, std::string> const& rhs)
{
return lhs.first->line_num().size() < rhs.first->line_num().size();
}
)->first->line_num().size();
std::ostringstream retval;
retval << message << newline;
for(std::size_t i=0; i<reg_com.size(); ++i)
{
if(i!=0 && reg_com.at(i-1).first->name() == reg_com.at(i).first->name())
{
retval << " ..." << newline;
}
else
{
retval << " --> " << reg_com.at(i).first->name() << newline;
}
const region_base* const reg = reg_com.at(i).first;
const std::string& comment = reg_com.at(i).second;
retval << ' ' << std::setw(line_num_width) << reg->line_num();
retval << " | " << reg->line() << newline;
retval << make_string(line_num_width + 1, ' ');
retval << " | " << make_string(reg->before(), ' ');
if(reg->size() == 1)
{
// invalid
// ^------
retval << '^';
retval << make_string(reg->after(), '-');
}
else
{
// invalid
// ~~~~~~~
retval << make_string(reg->size(), '~');
}
retval << ' ';
retval << comment << newline;
}
std::string retval;
retval += message;
retval += newline;
retval += " --> ";
retval += loc.name();
retval += newline;
retval += ' ';
retval += line_number;
retval += " | ";
retval += make_string(line_begin, line_end);
retval += newline;
retval += make_string(line_number.size() + 1, ' ');
retval += " | ";
retval += make_string(std::distance(line_begin, loc.iter()),' ');
retval += '^';
retval += make_string(std::distance(loc.iter(), line_end), '-');
retval += ' ';
retval += comment_for_underline;
if(helps.size() != 0)
{
retval += newline;
retval += make_string(line_number.size() + 1, ' ');
retval += " | ";
retval << newline;
retval << make_string(line_num_width + 1, ' ');
retval << " | ";
for(const auto help : helps)
{
retval += newline;
retval += "Hint: ";
retval += help;
retval << newline;
retval << "Hint: ";
retval << help;
}
}
return retval;
return retval.str();
}
} // detail
} // toml
#endif// TOML11_REGION_H

View File

@@ -29,8 +29,9 @@ inline void resize_impl(T& container, std::size_t N, std::true_type)
template<typename T>
inline void resize_impl(T& container, std::size_t N, std::false_type)
{
if(container.size() >= N) return;
else throw std::invalid_argument("not resizable type");
if(container.size() >= N) {return;}
throw std::invalid_argument("not resizable type");
}
} // detail
@@ -38,8 +39,9 @@ inline void resize_impl(T& container, std::size_t N, std::false_type)
template<typename T>
inline void resize(T& container, std::size_t N)
{
if(container.size() == N) return;
else return detail::resize_impl(container, N, detail::has_resize_method<T>());
if(container.size() == N) {return;}
return detail::resize_impl(container, N, detail::has_resize_method<T>());
}
namespace detail
@@ -73,7 +75,5 @@ T from_string(const std::string& str, U&& opt)
return v;
}
}// toml
#endif // TOML11_UTILITY

View File

@@ -21,8 +21,8 @@ namespace detail
{
// to show error messages. not recommended for users.
region_base const& get_region(const value&);
// ditto.
void assign_keeping_region(value&, value);
template<typename Region>
void change_region(value&, Region&&);
}// detail
template<typename T>
@@ -562,8 +562,8 @@ class value
// for error messages
friend region_base const& detail::get_region(const value&);
// to see why it's here, see detail::insert_nested_key.
friend void detail::assign_keeping_region(value&, value);
template<typename Region>
friend void detail::change_region(value&, Region&&);
template<value_t T>
struct switch_cast;
@@ -599,35 +599,21 @@ inline region_base const& get_region(const value& v)
{
return *(v.region_info_);
}
// If we keep region information after assigning another toml::* types, the
// error message become different from the actual value contained.
// To avoid this kind of confusing phenomena, the default assigners clear the
// old region_info_. But this functionality is actually needed deep inside of
// parser, so if you want to see the usecase, see toml::detail::insert_nested_key
// defined in toml/parser.hpp.
inline void assign_keeping_region(value& v, value other)
template<typename Region>
void change_region(value& v, Region&& reg)
{
v.cleanup(); // this keeps region info
// keep region_info_ intact
v.type_ = other.type();
switch(v.type())
{
case value_t::Boolean : ::toml::value::assigner(v.boolean_ , other.boolean_ ); break;
case value_t::Integer : ::toml::value::assigner(v.integer_ , other.integer_ ); break;
case value_t::Float : ::toml::value::assigner(v.floating_ , other.floating_ ); break;
case value_t::String : ::toml::value::assigner(v.string_ , other.string_ ); break;
case value_t::OffsetDatetime: ::toml::value::assigner(v.offset_datetime_, other.offset_datetime_); break;
case value_t::LocalDatetime : ::toml::value::assigner(v.local_datetime_ , other.local_datetime_ ); break;
case value_t::LocalDate : ::toml::value::assigner(v.local_date_ , other.local_date_ ); break;
case value_t::LocalTime : ::toml::value::assigner(v.local_time_ , other.local_time_ ); break;
case value_t::Array : ::toml::value::assigner(v.array_ , other.array_ ); break;
case value_t::Table : ::toml::value::assigner(v.table_ , other.table_ ); break;
default: break;
}
using region_type = typename std::remove_reference<
typename std::remove_cv<Region>::type
>::type;
std::shared_ptr<region_base> new_reg =
std::make_shared<region_type>(std::forward<region_type>(reg));
v.region_info_ = new_reg;
return;
}
}// detail
}// detail
template<> struct value::switch_cast<value_t::Boolean>
{