Compare commits

21 Commits

Author SHA1 Message Date
ToruNiina
46be054ce9 fix: improve err msg for multiline inline table
show "missing curly brace" instead of "missing table key-value separator"
2019-04-19 13:22:13 +09:00
ToruNiina
789d784769 chore: update README; about literals 2019-04-19 13:18:35 +09:00
ToruNiina
81deb8efde chore: update README 2019-04-19 12:41:24 +09:00
Toru Niina
072dccd05d Merge pull request #56 from ToruNiina/optimization
Optimization
2019-04-19 01:30:29 +09:00
ToruNiina
637c99d637 refactor: generate error message in parser 2019-04-18 15:09:58 +09:00
ToruNiina
0f48852730 perf: check value type before parsing
to avoid needless error message generation
2019-04-18 14:26:27 +09:00
ToruNiina
0499b2907d Merge branch 'master' into optimization 2019-04-18 14:10:08 +09:00
ToruNiina
61e69c9251 fix: count line number from 1, not 0 2019-04-18 13:56:19 +09:00
ToruNiina
4a560ea1e5 fix: show correct error message 2019-04-18 00:04:33 +09:00
ToruNiina
c5b6ee6f81 feat: add yet another constructor to value
to make implementation of parse_value easier
2019-04-17 23:43:42 +09:00
ToruNiina
1a7bf63622 Merge branch 'master' into optimization 2019-04-17 14:58:28 +09:00
Toru Niina
8847cdc0a9 Merge pull request #55 from wbenny/master
fix /W4 warnings on MSVC
2019-04-17 13:16:19 +09:00
ToruNiina
c82e76a111 perf: check string type before parsing it
to avoid unnecessary error message generation, check the first few
characters before parsing it. It makes the parsing process faster and
also helps to generate more accurate error messages.
2019-04-16 21:47:24 +09:00
ToruNiina
4db486d76d perf: check integer prefix before trying to parse
all the parsers generate error messages and error message generation is
not a lightweight task. It concatenates a lot of strings, it formats
many values, etc. To avoid useless error-message generation, first check
which prefix is used and then parse special integers. Additionally, by
checking that, the quality of the error message can be improved (later).
2019-04-16 21:37:12 +09:00
ToruNiina
91966a6917 perf: do not use concat_string if it is not needed
At an earlier stage of development, I thought it would be useful for
lexer-combinators to generate error messages, because then the
parser would not need to generate them. But it has turned
out that, to show an appropriate error message, the parser needs to generate
it according to the context, and almost all the messages from the lexer are
discarded. So I added another parameter to lexer-combinators to suppress
error message generation. In the future, we may want to remove messages
completely from lexers, but for now I will keep them. Removing the
unused message generation makes the parsing process faster.
2019-04-16 21:09:59 +09:00
ToruNiina
b3917aaadf refactor: use snprintf to show char in hex
instead of std::ostringstream.
2019-04-16 20:54:29 +09:00
Petr Benes
ba307003c4 fix /W4 warnings on MSVC 2019-04-16 13:25:45 +02:00
Toru Niina
21fd1271d9 Merge pull request #54 from ToruNiina/hotfix
fix: resolve ambiguity in the `""_toml` literal
2019-04-15 13:34:35 +09:00
ToruNiina
f9ab7d6f56 chore: add note about literals to README.md 2019-04-14 20:08:23 +09:00
ToruNiina
0a3a41a708 test: add test for literals for difficult case 2019-04-14 20:06:11 +09:00
ToruNiina
6c2a536fa5 fix: check literal has a table or an array first
A literal like `"[[table]]"_toml` caused a syntax error. This is
because the literal parser first checked whether it might be a bare value
without a key, and parse_array directly throws syntax_error. This
change makes the parser first check whether the literal is the name of a
table, and then parse the content.
2019-04-14 19:48:43 +09:00
8 changed files with 313 additions and 104 deletions
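
Commit 4db486d76d above argues that trying the binary, octal, and hexadecimal parsers one after another is wasteful, because each failed attempt builds an error message that is then thrown away. The sketch below illustrates the alternative it describes: look at the second character first and go straight to one parser. It is a simplified, standalone illustration; `int_kind` and `classify_integer_prefix` are hypothetical names, not part of toml11.

```cpp
#include <cctype>
#include <string>

// Hypothetical classification of an integer token, decided from at most
// its first two characters. None of these names come from toml11.
enum class int_kind { decimal, binary, octal, hexadecimal, invalid };

inline int_kind classify_integer_prefix(const std::string& token)
{
    if(token.empty()) {return int_kind::invalid;}
    if(token[0] != '0' || token.size() == 1)
    {
        // No leading zero, or the token is just "0": treat it as decimal.
        return int_kind::decimal;
    }
    switch(token[1])
    {
        case 'b': {return int_kind::binary;}      // e.g. 0b1100
        case 'o': {return int_kind::octal;}       // e.g. 0o775
        case 'x': {return int_kind::hexadecimal;} // e.g. 0xC0FFEE
        default:
        {
            // "012" (leading zero) and "0q1" (unknown prefix) are rejected
            // here in one step, without attempting three parsers first.
            if(std::isalnum(static_cast<unsigned char>(token[1])))
            {
                return int_kind::invalid;
            }
            // A lone zero followed by a delimiter, e.g. "0,".
            return int_kind::decimal;
        }
    }
}
```

Because the decision is made before any parser runs, the caller also knows why an input was rejected, which is what lets the real parser report "leading zero" and "unknown integer prefix" separately.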

View File

@@ -603,6 +603,27 @@ toml::value operator""_toml(const char* str, std::size_t len);
 Access to the operator can be gained with `using namespace toml::literals;`,
 `using namespace toml::toml_literals`, and `using namespace toml::literals::toml_literals`.
+Note that a key that is composed only of digits is allowed in TOML.
+And, unlike the file parser, toml-literal allows a bare value without a key.
+Thus it is difficult to distinguish arrays of integers from definitions of
+tables whose names consist of digits.
+Currently, the literal `[1]` becomes a table named "1".
+To ensure that a literal is treated as an array with one element, you need to
+add a comma after the first element (like `[1,]`).
+```cpp
+"[1,2,3]"_toml; // This is an array
+"[table]"_toml; // This is a table that has an empty table named "table" inside.
+"[[1,2,3]]"_toml; // This is an array of arrays
+"[[table]]"_toml; // This is a table that has an array of tables inside.
+"[[1]]"_toml; // This literal is ambiguous.
+// Currently, it becomes a table that has an array of tables named "1".
+"1 = [{}]"_toml; // This is a table that has an array of tables named 1.
+"[[1,]]"_toml; // This is an array of arrays.
+"[[1],]"_toml; // ditto.
+```
 ## Conversion between toml value and arbitrary types
 You can also use `toml::get` and other related functions with the types you defined
@@ -961,11 +982,13 @@ I appreciate the help of the contributors who introduced the great feature to th
 - Quentin Khan (@xaxousis)
 - Found & Fixed a bug around ODR
 - Improved error messages for invalid keys to show the location where the parser fails
+- Petr Beneš (@wbenny)
+- Fixed warnings on MSVC
 ## Licensing terms
 This product is licensed under the terms of the [MIT License](LICENSE).
-- Copyright (c) 2017 Toru Niina
+- Copyright (c) 2017-2019 Toru Niina
 All rights reserved.

View File

@@ -33,6 +33,18 @@ BOOST_AUTO_TEST_CASE(test_file_as_literal)
b = "baz" b = "baz"
)"_toml; )"_toml;
BOOST_CHECK_EQUAL(r, v);
}
{
const toml::value r{
{"table", toml::table{{"a", 42}, {"b", "baz"}}}
};
const toml::value v = u8R"(
[table]
a = 42
b = "baz"
)"_toml;
BOOST_CHECK_EQUAL(r, v); BOOST_CHECK_EQUAL(r, v);
} }
} }
@@ -91,6 +103,19 @@ BOOST_AUTO_TEST_CASE(test_value_as_literal)
BOOST_CHECK(v1.is_array()); BOOST_CHECK(v1.is_array());
BOOST_CHECK((toml::get<std::vector<int>>(v1) == std::vector<int>{1,2,3})); BOOST_CHECK((toml::get<std::vector<int>>(v1) == std::vector<int>{1,2,3}));
const toml::value v2 = u8R"([1,])"_toml;
BOOST_CHECK(v2.is_array());
BOOST_CHECK((toml::get<std::vector<int>>(v2) == std::vector<int>{1}));
const toml::value v3 = u8R"([[1,]])"_toml;
BOOST_CHECK(v3.is_array());
BOOST_CHECK((toml::get<std::vector<int>>(toml::get<toml::array>(v3).front()) == std::vector<int>{1}));
const toml::value v4 = u8R"([[1],])"_toml;
BOOST_CHECK(v4.is_array());
BOOST_CHECK((toml::get<std::vector<int>>(toml::get<toml::array>(v4).front()) == std::vector<int>{1}));
} }
{ {
const toml::value v1 = u8R"({a = 42})"_toml; const toml::value v1 = u8R"({a = 42})"_toml;

View File

@@ -9,7 +9,10 @@
#include <type_traits> #include <type_traits>
#include <iterator> #include <iterator>
#include <limits> #include <limits>
#include <array>
#include <iomanip> #include <iomanip>
#include <cstdio>
#include <cassert>
#include <cctype> #include <cctype>
// they scans characters and returns region if it matches to the condition. // they scans characters and returns region if it matches to the condition.
@@ -38,10 +41,12 @@ inline std::string show_char(const char c)
} }
else else
{ {
std::ostringstream oss; std::array<char, 5> buf;
oss << "0x" << std::hex << std::setfill('0') << std::setw(2) buf.fill('\0');
<< static_cast<int>(c); const auto r = std::snprintf(
return oss.str(); buf.data(), buf.size(), "0x%02x", static_cast<int>(c) & 0xFF);
assert(r == buf.size() - 1);
return std::string(buf.data());
} }
} }
@@ -51,7 +56,8 @@ struct character
static constexpr char target = C; static constexpr char target = C;
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
@@ -61,10 +67,14 @@ struct character
const char c = *(loc.iter()); const char c = *(loc.iter());
if(c != target) if(c != target)
{
if(msg)
{ {
return err(concat_to_string("expected '", show_char(target), return err(concat_to_string("expected '", show_char(target),
"' but got '", show_char(c), "'.")); "' but got '", show_char(c), "'."));
} }
return err("");
}
loc.advance(); // update location loc.advance(); // update location
return ok(region<Cont>(loc, first, loc.iter())); return ok(region<Cont>(loc, first, loc.iter()));
@@ -86,7 +96,8 @@ struct in_range
static constexpr char lower = Low; static constexpr char lower = Low;
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
@@ -96,11 +107,15 @@ struct in_range
const char c = *(loc.iter()); const char c = *(loc.iter());
if(c < lower || upper < c) if(c < lower || upper < c)
{
if(msg)
{ {
return err(concat_to_string("expected character in range " return err(concat_to_string("expected character in range "
"[", show_char(lower), ", ", show_char(upper), "] but got ", "[", show_char(lower), ", ", show_char(upper), "] but got ",
"'", show_char(c), "'.")); "'", show_char(c), "'."));
} }
return err("");
}
loc.advance(); loc.advance();
return ok(region<Cont>(loc, first, loc.iter())); return ok(region<Cont>(loc, first, loc.iter()));
@@ -120,7 +135,8 @@ template<typename Combinator>
struct exclude struct exclude
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
@@ -128,13 +144,16 @@ struct exclude
if(loc.iter() == loc.end()) {return err("not sufficient characters");} if(loc.iter() == loc.end()) {return err("not sufficient characters");}
auto first = loc.iter(); auto first = loc.iter();
auto rslt = Combinator::invoke(loc); auto rslt = Combinator::invoke(loc, msg);
if(rslt.is_ok()) if(rslt.is_ok())
{ {
loc.reset(first); loc.reset(first);
return err(concat_to_string( if(msg)
"invalid pattern (", Combinator::pattern(), ") appeared ", {
rslt.unwrap().str())); return err(concat_to_string("invalid pattern (",
Combinator::pattern(), ") appeared ", rslt.unwrap().str()));
}
return err("");
} }
loc.reset(std::next(first)); // XXX maybe loc.advance() is okay but... loc.reset(std::next(first)); // XXX maybe loc.advance() is okay but...
return ok(region<Cont>(loc, first, loc.iter())); return ok(region<Cont>(loc, first, loc.iter()));
@@ -151,12 +170,13 @@ template<typename Combinator>
struct maybe struct maybe
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
const auto rslt = Combinator::invoke(loc); const auto rslt = Combinator::invoke(loc, msg);
if(rslt.is_ok()) if(rslt.is_ok())
{ {
return rslt; return rslt;
@@ -177,34 +197,36 @@ template<typename Head, typename ... Tail>
struct sequence<Head, Tail...> struct sequence<Head, Tail...>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
const auto first = loc.iter(); const auto first = loc.iter();
const auto rslt = Head::invoke(loc); const auto rslt = Head::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
loc.reset(first); loc.reset(first);
return err(rslt.unwrap_err()); return err(rslt.unwrap_err());
} }
return sequence<Tail...>::invoke(loc, std::move(rslt.unwrap()), first); return sequence<Tail...>::invoke(loc, std::move(rslt.unwrap()), first, msg);
} }
// called from the above function only, recursively. // called from the above function only, recursively.
template<typename Cont, typename Iterator> template<typename Cont, typename Iterator>
static result<region<Cont>, std::string> static result<region<Cont>, std::string>
invoke(location<Cont>& loc, region<Cont> reg, Iterator first) invoke(location<Cont>& loc, region<Cont> reg, Iterator first,
const bool msg = false)
{ {
const auto rslt = Head::invoke(loc); const auto rslt = Head::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
loc.reset(first); loc.reset(first);
return err(rslt.unwrap_err()); return err(rslt.unwrap_err());
} }
reg += rslt.unwrap(); // concat regions reg += rslt.unwrap(); // concat regions
return sequence<Tail...>::invoke(loc, std::move(reg), first); return sequence<Tail...>::invoke(loc, std::move(reg), first, msg);
} }
static std::string pattern() static std::string pattern()
@@ -219,9 +241,10 @@ struct sequence<Head>
// would be called from sequence<T ...>::invoke only. // would be called from sequence<T ...>::invoke only.
template<typename Cont, typename Iterator> template<typename Cont, typename Iterator>
static result<region<Cont>, std::string> static result<region<Cont>, std::string>
invoke(location<Cont>& loc, region<Cont> reg, Iterator first) invoke(location<Cont>& loc, region<Cont> reg, Iterator first,
const bool msg = false)
{ {
const auto rslt = Head::invoke(loc); const auto rslt = Head::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
loc.reset(first); loc.reset(first);
@@ -240,14 +263,15 @@ template<typename Head, typename ... Tail>
struct either<Head, Tail...> struct either<Head, Tail...>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
const auto rslt = Head::invoke(loc); const auto rslt = Head::invoke(loc, msg);
if(rslt.is_ok()) {return rslt;} if(rslt.is_ok()) {return rslt;}
return either<Tail...>::invoke(loc); return either<Tail...>::invoke(loc, msg);
} }
static std::string pattern() static std::string pattern()
@@ -259,11 +283,12 @@ template<typename Head>
struct either<Head> struct either<Head>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
static_assert(std::is_same<char, typename Cont::value_type>::value, static_assert(std::is_same<char, typename Cont::value_type>::value,
"internal error: container::value_type should be `char`."); "internal error: container::value_type should be `char`.");
return Head::invoke(loc); return Head::invoke(loc, msg);
} }
static std::string pattern() static std::string pattern()
{ {
@@ -282,13 +307,14 @@ template<typename T, std::size_t N>
struct repeat<T, exactly<N>> struct repeat<T, exactly<N>>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
region<Cont> retval(loc); region<Cont> retval(loc);
const auto first = loc.iter(); const auto first = loc.iter();
for(std::size_t i=0; i<N; ++i) for(std::size_t i=0; i<N; ++i)
{ {
auto rslt = T::invoke(loc); auto rslt = T::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
loc.reset(first); loc.reset(first);
@@ -308,14 +334,15 @@ template<typename T, std::size_t N>
struct repeat<T, at_least<N>> struct repeat<T, at_least<N>>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
region<Cont> retval(loc); region<Cont> retval(loc);
const auto first = loc.iter(); const auto first = loc.iter();
for(std::size_t i=0; i<N; ++i) for(std::size_t i=0; i<N; ++i)
{ {
auto rslt = T::invoke(loc); auto rslt = T::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
loc.reset(first); loc.reset(first);
@@ -325,7 +352,7 @@ struct repeat<T, at_least<N>>
} }
while(true) while(true)
{ {
auto rslt = T::invoke(loc); auto rslt = T::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
return ok(std::move(retval)); return ok(std::move(retval));
@@ -343,12 +370,13 @@ template<typename T>
struct repeat<T, unlimited> struct repeat<T, unlimited>
{ {
template<typename Cont> template<typename Cont>
static result<region<Cont>, std::string> invoke(location<Cont>& loc) static result<region<Cont>, std::string>
invoke(location<Cont>& loc, const bool msg = false)
{ {
region<Cont> retval(loc); region<Cont> retval(loc);
while(true) while(true)
{ {
auto rslt = T::invoke(loc); auto rslt = T::invoke(loc, msg);
if(rslt.is_err()) if(rslt.is_err())
{ {
return ok(std::move(retval)); return ok(std::move(retval));
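
The hunks above thread a `msg` flag through every combinator so that the (comparatively expensive) error string is only built when the caller asks for it. The following is a minimal standalone sketch of that pattern, not the library's code: `match_result` and `single_char` are hypothetical stand-ins for the `result`, `region`, and `character` templates in the diff, and the input is a plain `std::string` with an index instead of a `location`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, much-simplified result type: a success flag plus a message.
struct match_result
{
    bool        ok;
    std::string message; // empty on success or when messages are suppressed
};

// Matches exactly one character C at input[pos].
template<char C>
struct single_char
{
    static match_result
    invoke(const std::string& input, std::size_t& pos, const bool msg = false)
    {
        if(pos < input.size() && input[pos] == C)
        {
            ++pos; // consume the matched character
            return match_result{true, ""};
        }
        if(msg)
        {
            // Only this branch pays for string concatenation.
            const std::string got = (pos < input.size())
                ? std::string(1, input[pos]) : std::string("end of input");
            return match_result{false,
                std::string("expected '") + C + "' but got '" + got + "'."};
        }
        return match_result{false, ""}; // cheap failure, no message built
    }
};
```

With the default `msg = false`, a failed match returns an empty message, which suits speculative attempts whose errors would be discarded anyway. Passing `true` restores the detailed text for the few call sites that actually report it.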

View File

@@ -25,16 +25,18 @@ inline std::tm localtime_s(const std::time_t* src)
{ {
std::tm dst; std::tm dst;
const auto result = ::localtime_r(src, &dst); const auto result = ::localtime_r(src, &dst);
if(!result) if (!result) { throw std::runtime_error("localtime_r failed."); }
{ return dst;
throw std::runtime_error("localtime_r failed.");
} }
#elif _MSC_VER
inline std::tm localtime_s(const std::time_t* src)
{
std::tm dst;
const auto result = ::localtime_s(&dst, src);
if (result) { throw std::runtime_error("localtime_s failed."); }
return dst; return dst;
} }
#else #else
// XXX: On Windows, std::localtime is thread-safe because they uses thread-local
// storage to store the instance of std::tm. On the other platforms, it may not
// be thread-safe.
inline std::tm localtime_s(const std::time_t* src) inline std::tm localtime_s(const std::time_t* src)
{ {
const auto result = std::localtime(src); const auto result = std::localtime(src);
@@ -360,12 +362,12 @@ struct local_datetime
// can be used to get millisecond & microsecond information. // can be used to get millisecond & microsecond information.
const auto t_diff = tp - const auto t_diff = tp -
std::chrono::system_clock::from_time_t(std::mktime(&time)); std::chrono::system_clock::from_time_t(std::mktime(&time));
this->time.millisecond = std::chrono::duration_cast< this->time.millisecond = static_cast<std::uint16_t>(
std::chrono::milliseconds>(t_diff).count(); std::chrono::duration_cast<std::chrono::milliseconds>(t_diff).count());
this->time.microsecond = std::chrono::duration_cast< this->time.microsecond = static_cast<std::uint16_t>(
std::chrono::microseconds>(t_diff).count(); std::chrono::duration_cast<std::chrono::microseconds>(t_diff).count());
this->time.nanosecond = std::chrono::duration_cast< this->time.nanosecond = static_cast<std::uint16_t>(
std::chrono::nanoseconds >(t_diff).count(); std::chrono::duration_cast<std::chrono::nanoseconds >(t_diff).count());
} }
explicit local_datetime(const std::time_t t) explicit local_datetime(const std::time_t t)
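
The first hunk above adds an MSVC branch next to the existing `localtime_r` one. The two functions are easy to confuse because their argument order and failure conventions differ. Below is a minimal sketch of a wrapper covering both, under the assumption that any non-MSVC target provides POSIX `localtime_r`; the name `to_local_tm` is hypothetical and is not the function in the diff.

```cpp
#include <ctime>
#include <stdexcept>
#include <time.h> // localtime_r (POSIX) / localtime_s (MSVC)

// Hypothetical wrapper: converts a time_t to local broken-down time
// without touching the shared static buffer used by std::localtime.
inline std::tm to_local_tm(const std::time_t t)
{
    std::tm dst{};
#if defined(_MSC_VER)
    // Microsoft's localtime_s takes (destination, source) and returns
    // a nonzero error code on failure.
    if(::localtime_s(&dst, &t) != 0)
    {
        throw std::runtime_error("localtime_s failed.");
    }
#else
    // Assumption: a POSIX system. localtime_r takes (source, destination)
    // and returns a null pointer on failure.
    if(::localtime_r(&t, &dst) == nullptr)
    {
        throw std::runtime_error("localtime_r failed.");
    }
#endif
    return dst;
}
```

Returning the `std::tm` by value keeps the caller independent of which platform function was used.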

View File

@@ -30,13 +30,46 @@ inline ::toml::value operator""_toml(const char* str, std::size_t len)
::toml::detail::lex_ws, ::toml::detail::at_least<1>>; ::toml::detail::lex_ws, ::toml::detail::at_least<1>>;
skip_ws::invoke(loc); skip_ws::invoke(loc);
// literal may be a bare value. try them first. // to distinguish arrays and tables, first check it is a table or not.
//
// "[1,2,3]"_toml; // this is an array
// "[table]"_toml; // a table that has an empty table named "table" inside.
// "[[1,2,3]]"_toml; // this is an array of arrays
// "[[table]]"_toml; // this is a table that has an array of tables inside.
//
// "[[1]]"_toml; // this can be both... (currently it becomes a table)
// "1 = [{}]"_toml; // this is a table that has an array of table named 1.
// "[[1,]]"_toml; // this is an array of arrays.
// "[[1],]"_toml; // this also.
const auto the_front = loc.iter();
const bool is_table_key = ::toml::detail::lex_std_table::invoke(loc);
loc.reset(the_front);
const bool is_aots_key = ::toml::detail::lex_array_table::invoke(loc);
loc.reset(the_front);
// If it is neither a table-key or a array-of-table-key, it may be a value.
if(!is_table_key && !is_aots_key)
{
if(auto data = ::toml::detail::parse_value(loc)) if(auto data = ::toml::detail::parse_value(loc))
{ {
return data.unwrap(); return data.unwrap();
} }
}
// Note that still it can be a table, because the literal might be something
// like the following.
// ```cpp
// R"( // c++11 raw string literals
// key = "value"
// int = 42
// )"_toml;
// ```
// It is a valid toml file.
// It should be parsed as if we parse a file with this content.
// literal is a TOML file (i.e. multiline table).
if(auto data = ::toml::detail::parse_toml_file(loc)) if(auto data = ::toml::detail::parse_toml_file(loc))
{ {
loc.reset(loc.begin()); // rollback to the top of the literal loc.reset(loc.begin()); // rollback to the top of the literal
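
The hunk above decides between "table header" and "bare value" by first testing whether the literal starts with something that lexes as `[key]` or `[[key]]`, and only falls back to `parse_value` when neither matches. The sketch below reproduces that ordering in isolation. It is a deliberately reduced model: all names are hypothetical, and only bare keys (`A-Za-z0-9_-`) without dots, quotes, or inner whitespace are recognized, which is narrower than what the real `lex_std_table` and `lex_array_table` accept.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result of the check; not a toml11 type.
enum class literal_kind { table_header, array_of_tables_header, value };

// Consume one or more bare-key characters starting at `pos`.
inline bool skip_bare_key(const std::string& s, std::size_t& pos)
{
    const std::size_t first = pos;
    while(pos < s.size() &&
          (std::isalnum(static_cast<unsigned char>(s[pos])) ||
           s[pos] == '_' || s[pos] == '-'))
    {
        ++pos;
    }
    return pos != first;
}

// True if `s` starts with `open`, then a bare key, then `close`.
inline bool starts_with_header(const std::string& s,
                               const std::string& open,
                               const std::string& close)
{
    if(s.compare(0, open.size(), open) != 0) {return false;}
    std::size_t pos = open.size();
    if(!skip_bare_key(s, pos)) {return false;}
    return s.compare(pos, close.size(), close) == 0;
}

// Headers are tested first; anything else is treated as a bare value.
inline literal_kind classify_literal(const std::string& s)
{
    if(starts_with_header(s, "[[", "]]"))
    {
        return literal_kind::array_of_tables_header;
    }
    if(starts_with_header(s, "[", "]"))
    {
        return literal_kind::table_header;
    }
    return literal_kind::value;
}
```

Under this model `[1]` is a table header while `[1,]` is a value, because the comma prevents the header pattern from matching. This is the same reason the trailing comma disambiguates one-element arrays in the literals listed in the comment above.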

View File

@@ -116,10 +116,28 @@ parse_integer(location<Container>& loc)
const auto first = loc.iter(); const auto first = loc.iter();
if(first != loc.end() && *first == '0') if(first != loc.end() && *first == '0')
{ {
if(const auto bin = parse_binary_integer (loc)) {return bin;} const auto second = std::next(first);
if(const auto oct = parse_octal_integer (loc)) {return oct;} if(second == loc.end()) // the token is just zero.
if(const auto hex = parse_hexadecimal_integer(loc)) {return hex;} {
// else, maybe just zero. return ok(std::make_pair(0, region<Container>(loc, first, second)));
}
if(*second == 'b') {return parse_binary_integer (loc);} // 0b1100
if(*second == 'o') {return parse_octal_integer (loc);} // 0o775
if(*second == 'x') {return parse_hexadecimal_integer(loc);} // 0xC0FFEE
if(std::isdigit(*second))
{
return err(format_underline("[error] toml::parse_integer: "
"leading zero in an Integer is not allowed.",
{{std::addressof(loc), "leading zero"}}));
}
else if(std::isalpha(*second))
{
return err(format_underline("[error] toml::parse_integer: "
"unknown integer prefix appeared.",
{{std::addressof(loc), "none of 0x, 0o, 0b"}}));
}
} }
if(const auto token = lex_dec_int::invoke(loc)) if(const auto token = lex_dec_int::invoke(loc))
@@ -308,7 +326,7 @@ result<std::string, std::string> parse_escape_sequence(location<Container>& loc)
{ {
return err(format_underline("[error] parse_escape_sequence: " return err(format_underline("[error] parse_escape_sequence: "
"invalid token found in UTF-8 codepoint uXXXX.", "invalid token found in UTF-8 codepoint uXXXX.",
{{std::addressof(loc), token.unwrap_err()}})); {{std::addressof(loc), "here"}}));
} }
} }
case 'U': case 'U':
@@ -321,7 +339,7 @@ result<std::string, std::string> parse_escape_sequence(location<Container>& loc)
{ {
return err(format_underline("[error] parse_escape_sequence: " return err(format_underline("[error] parse_escape_sequence: "
"invalid token found in UTF-8 codepoint Uxxxxxxxx", "invalid token found in UTF-8 codepoint Uxxxxxxxx",
{{std::addressof(loc), token.unwrap_err()}})); {{std::addressof(loc), "here"}}));
} }
} }
} }
@@ -388,7 +406,9 @@ parse_ml_basic_string(location<Container>& loc)
else else
{ {
loc.reset(first); loc.reset(first);
return err(token.unwrap_err()); return err(format_underline("[error] toml::parse_ml_basic_string: "
"the next token is not a multiline string",
{{std::addressof(loc), "here"}}));
} }
} }
@@ -437,7 +457,9 @@ parse_basic_string(location<Container>& loc)
else else
{ {
loc.reset(first); // rollback loc.reset(first); // rollback
return err(token.unwrap_err()); return err(format_underline("[error] toml::parse_basic_string: "
"the next token is not a string",
{{std::addressof(loc), "here"}}));
} }
} }
@@ -476,7 +498,9 @@ parse_ml_literal_string(location<Container>& loc)
else else
{ {
loc.reset(first); // rollback loc.reset(first); // rollback
return err(token.unwrap_err()); return err(format_underline("[error] toml::parse_ml_literal_string: "
"the next token is not a multiline literal string",
{{std::addressof(loc), "here"}}));
} }
} }
@@ -513,7 +537,9 @@ parse_literal_string(location<Container>& loc)
else else
{ {
loc.reset(first); // rollback loc.reset(first); // rollback
return err(token.unwrap_err()); return err(format_underline("[error] toml::parse_literal_string: "
"the next token is not a literal string",
{{std::addressof(loc), "here"}}));
} }
} }
@@ -521,10 +547,30 @@ template<typename Container>
result<std::pair<toml::string, region<Container>>, std::string> result<std::pair<toml::string, region<Container>>, std::string>
parse_string(location<Container>& loc) parse_string(location<Container>& loc)
{ {
if(const auto rslt = parse_ml_basic_string(loc)) {return rslt;} if(loc.iter() != loc.end() && *(loc.iter()) == '"')
if(const auto rslt = parse_ml_literal_string(loc)) {return rslt;} {
if(const auto rslt = parse_basic_string(loc)) {return rslt;} if(loc.iter() + 1 != loc.end() && *(loc.iter() + 1) == '"' &&
if(const auto rslt = parse_literal_string(loc)) {return rslt;} loc.iter() + 2 != loc.end() && *(loc.iter() + 2) == '"')
{
return parse_ml_basic_string(loc);
}
else
{
return parse_basic_string(loc);
}
}
else if(loc.iter() != loc.end() && *(loc.iter()) == '\'')
{
if(loc.iter() + 1 != loc.end() && *(loc.iter() + 1) == '\'' &&
loc.iter() + 2 != loc.end() && *(loc.iter() + 2) == '\'')
{
return parse_ml_literal_string(loc);
}
else
{
return parse_literal_string(loc);
}
}
return err(format_underline("[error] toml::parse_string: ", return err(format_underline("[error] toml::parse_string: ",
{{std::addressof(loc), "the next token is not a string"}})); {{std::addressof(loc), "the next token is not a string"}}));
} }
@@ -758,7 +804,7 @@ parse_offset_datetime(location<Container>& loc)
{ {
loc.reset(first); loc.reset(first);
return err(format_underline("[error]: toml::parse_offset_datetime: ", return err(format_underline("[error]: toml::parse_offset_datetime: ",
{{std::addressof(loc), "the next token is not a local_datetime"}})); {{std::addressof(loc), "the next token is not a offset_datetime"}}));
} }
} }
@@ -1360,10 +1406,16 @@ parse_inline_table(location<Container>& loc)
return ok(std::make_pair( return ok(std::make_pair(
retval, region<Container>(loc, first, loc.iter()))); retval, region<Container>(loc, first, loc.iter())));
} }
else if(*loc.iter() == '#' || *loc.iter() == '\r' || *loc.iter() == '\n')
{
throw syntax_error(format_underline("[error] "
"toml::parse_inline_table: missing curly brace `}`",
{{std::addressof(loc), "should be `}`"}}));
}
else else
{ {
throw syntax_error(format_underline("[error] " throw syntax_error(format_underline("[error] "
"toml:::parse_inline_table: missing table separator `,` ", "toml::parse_inline_table: missing table separator `,` ",
{{std::addressof(loc), "should be `,`"}})); {{std::addressof(loc), "should be `,`"}}));
} }
} }
@@ -1374,6 +1426,46 @@ parse_inline_table(location<Container>& loc)
{{std::addressof(loc), "should be closed"}})); {{std::addressof(loc), "should be closed"}}));
} }
template<typename Container>
value_t guess_number_type(const location<Container>& l)
{
location<Container> loc = l;
if(lex_offset_date_time::invoke(loc)) {return value_t::OffsetDatetime;}
loc.reset(l.iter());
if(lex_local_date_time::invoke(loc)) {return value_t::LocalDatetime;}
loc.reset(l.iter());
if(lex_local_date::invoke(loc)) {return value_t::LocalDate;}
loc.reset(l.iter());
if(lex_local_time::invoke(loc)) {return value_t::LocalTime;}
loc.reset(l.iter());
if(lex_float::invoke(loc)) {return value_t::Float;}
loc.reset(l.iter());
return value_t::Integer;
}
template<typename Container>
value_t guess_value_type(const location<Container>& loc)
{
switch(*loc.iter())
{
case '"' : {return value_t::String; }
case '\'': {return value_t::String; }
case 't' : {return value_t::Boolean;}
case 'f' : {return value_t::Boolean;}
case '[' : {return value_t::Array; }
case '{' : {return value_t::Table; }
case 'i' : {return value_t::Float; } // inf.
case 'n' : {return value_t::Float; } // nan.
default : {return guess_number_type(loc);}
}
}
template<typename Container> template<typename Container>
result<value, std::string> parse_value(location<Container>& loc) result<value, std::string> parse_value(location<Container>& loc)
{ {
@@ -1383,32 +1475,28 @@ result<value, std::string> parse_value(location<Container>& loc)
         return err(format_underline("[error] toml::parse_value: input is empty",
                    {{std::addressof(loc), ""}}));
     }
-    if(auto r = parse_string (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_array (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_inline_table (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_boolean (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_offset_datetime(loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_local_datetime (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_local_date (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_local_time (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_floating (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    if(auto r = parse_integer (loc))
-    {return ok(value(std::move(r.unwrap().first), std::move(r.unwrap().second)));}
-    const auto msg = format_underline("[error] toml::parse_value: "
-            "unknown token appeared", {{std::addressof(loc), "unknown"}});
-    loc.reset(first);
-    return err(msg);
+    switch(guess_value_type(loc))
+    {
+        case value_t::Boolean : {return parse_boolean(loc); }
+        case value_t::Integer : {return parse_integer(loc); }
+        case value_t::Float : {return parse_floating(loc); }
+        case value_t::String : {return parse_string(loc); }
+        case value_t::OffsetDatetime : {return parse_offset_datetime(loc);}
+        case value_t::LocalDatetime : {return parse_local_datetime(loc); }
+        case value_t::LocalDate : {return parse_local_date(loc); }
+        case value_t::LocalTime : {return parse_local_time(loc); }
+        case value_t::Array : {return parse_array(loc); }
+        case value_t::Table : {return parse_inline_table(loc); }
+        default:
+        {
+            const auto msg = format_underline("[error] toml::parse_value: "
+                    "unknown token appeared", {{std::addressof(loc), "unknown"}});
+            loc.reset(first);
+            return err(msg);
+        }
+    }
 }
 template<typename Container>
 result<std::pair<std::vector<key>, region<Container>>, std::string>
@@ -1463,7 +1551,8 @@ parse_table_key(location<Container>& loc)
     }
     else
     {
-        return err(token.unwrap_err());
+        return err(format_underline("[error] toml::parse_table_key: "
+            "not a valid table key", {{std::addressof(loc), "here"}}));
     }
 }
@@ -1471,7 +1560,7 @@ template<typename Container>
 result<std::pair<std::vector<key>, region<Container>>, std::string>
 parse_array_table_key(location<Container>& loc)
 {
-    if(auto token = lex_array_table::invoke(loc))
+    if(auto token = lex_array_table::invoke(loc, true))
     {
         location<std::string> inner_loc(loc.name(), token.unwrap().str());
@@ -1516,7 +1605,8 @@ parse_array_table_key(location<Container>& loc)
     }
     else
     {
-        return err(token.unwrap_err());
+        return err(format_underline("[error] toml::parse_array_table_key: "
+            "not a valid table key", {{std::addressof(loc), "here"}}));
     }
 }

@@ -71,7 +71,7 @@ struct location final : public region_base
         "container should be randomly accessible");
     location(std::string name, Container cont)
-      : source_(std::make_shared<Container>(std::move(cont))), line_number_(0),
+      : source_(std::make_shared<Container>(std::move(cont))), line_number_(1),
         source_name_(std::move(name)), iter_(source_->cbegin())
     {}
     location(const location&) = default;
@@ -88,7 +88,7 @@ struct location final : public region_base
     const_iterator begin() const noexcept {return source_->cbegin();}
     const_iterator end() const noexcept {return source_->cend();}
-    // XXX At first, `location::line_num()` is implemented using `std::count` to
+    // XXX `location::line_num()` used to be implemented using `std::count` to
     // count a number of '\n'. But with a long toml file (typically, 10k lines),
     // it becomes intolerably slow because each time it generates error messages,
     // it counts '\n' from thousands of characters. To workaround it, I decided
@@ -110,8 +110,8 @@ struct location final : public region_base
     }
     void reset(const_iterator rollback) noexcept
     {
-        // since c++11, std::distance works in both ways and returns a negative
-        // value if `first` is ahead from `last`.
+        // since c++11, std::distance works in both ways for random-access
+        // iterators and returns a negative value if `first > last`.
         if(0 <= std::distance(rollback, this->iter_)) // rollback < iter
         {
             this->line_number_ -= std::count(rollback, this->iter_, '\n');

@@ -572,6 +572,14 @@ class value
         return *this;
     }
+    // for internal use ------------------------------------------------------
+    template<typename T, typename Container, typename std::enable_if<
+        detail::is_exact_toml_type<T>::value, std::nullptr_t>::type = nullptr>
+    value(std::pair<T, detail::region<Container>> parse_result)
+        : value(std::move(parse_result.first), std::move(parse_result.second))
+    {}
     // type checking and casting ============================================
     template<typename T>