This isn't just the oldest recorded transaction, it's nearly the oldest known recognizable sample of human writing. Not a love letter or a sermon or a story, but a receipt. This probably reflects their ubiquity rather than importance. There is one known older writing sample, the Kish Tablet of Jemdet Nasr. Since that tablet represents lists and counts of goods (barley, oil, livestock), it may also be a receipt, or perhaps an inventory.
The oldest known non-commercial writing is a set of proverbs from around 2600 BCE, Instructions of Shuruppak.
With my luck my most cringe-worthy diary entries will probably last that long.
ants_everywhere 1 days ago [-]
One of the theories of how writing was invented was via transactions and accounting.
You start keeping items in clay jars. You eventually mark the jars with a depiction of what's in it. Those marks begin standing in for the items themselves when communicating across languages or keeping records of how many items and jars you have.
tycho-newman 42 minutes ago [-]
Writing probably arose from reading animal tracks in soft mud.
jeanlucas 10 hours ago [-]
Impossible to truly know. Writing may well have started with doodles, notes, even jokes on materials like leaves or wood that didn’t survive.
What survives are the "important" texts because you would deliberately put them on durable material. That creates a bias where early writing looks purely transactional.
Same reason we think of pyramids when we think of ancient architecture: stone lasts, wood doesn’t.
ants_everywhere 7 hours ago [-]
That's true, we don't know anything about markings that were made on organic materials.
We do know that art and other markings date tens of thousands of years before the first proto-writing. Writing is specifically about markings that form a language. So doodles and visual jokes (e.g. phalluses) wouldn't count. I don't know what you mean by notes, but writing notes without a language would be difficult I suspect.
But there could be early languages that were written on organic materials. The main problem is there's a bootstrapping problem where you need to account for how the first one developed at all. After that you can continuously improve over time.
maratc 1 days ago [-]
> Not a love letter or a sermon or a story, but a receipt. This probably reflects their ubiquity rather than importance.
We humans are pretty good at remembering sermons and stories and we can recreate them from memory and pass them down to the next generations. We however suck at remembering numbers, that's why we invented writing so we could write the numbers down and rely on these records, instead of on bad human memory.
lazide 19 hours ago [-]
Well, we especially suck at remembering numbers when it’s something like ‘how much I owe Bob’ hah. At least for many people.
I expect this writing was a way to help reduce civil unrest/murder by reducing he said/she said arguments about goods, services, money, etc.
ahmedfromtunis 1 days ago [-]
That was not by (deliberate) choice.
The earliest writings were actually logographic or semasiographic, meaning they represented ideas, objects, or concepts directly rather than the sounds of a specific spoken language.
We actually don't know what language(s) was/were spoken by the people who recorded the earliest tablets (not sure if that also applies to this particular one, though).
Phonographic writing developed much later and with it came all the forms of textual recordings we're familiar with.
thaumasiotes 1 days ago [-]
> Phonographic writing developed much later
Well, the earliest signs are logographic.
But phonographic writing didn't take long to develop. Once you've got a few logographs, it becomes apparent immediately that you can't extend that approach to everything you can say.
dotancohen 1 days ago [-]
> Once you've got a few logographs, it becomes apparent immediately that you can't extend that approach to everything you can say.
The converse is just as true. Not all things you can think, you can say. I remember sometime in my teens realizing that my thoughts are constrained by my language, an epiphany that sparked a life long interest in language. Now some 30 years later, I feel that I can feel ideas that I don't know how to express, but not for lack of language. Rather, some ideas are too complex for our simple speech. Just as a dog would be unable to bark the idea "energy is neither created nor destroyed".
smj-edison 1 days ago [-]
Bit of a tangent, but have you followed dynamic land at all? Their whole thing is expressing ideas through a dynamic medium, to convey things we can't explain through speech. You might find it interesting :)
Do you have an example? I can’t possibly think of a single idea that’s completely expressionless. Even drug-fueled hallucinations can eventually be given a description; albeit without being able to actually transfer the feeling/internalization of it.
You might have to be overly verbose and explicit in your language, but ultimately you can describe pretty much anything using “like”, “as”, and “akin to” with qualifiers.
dotancohen 20 hours ago [-]
I think that any parent holding their baby for the first time will give an example. There is a feeling of existing, of purpose, of continuity. But no "like", "as", or "akin to" suffices.
cyberax 1 days ago [-]
> But phonographic writing didn't take long to develop.
But it did. It took around 1500 years from the first writing systems to fully phonetic systems. And we still have Chinese characters even now, or the Tibetan writing system.
For some reason, writing systems tend to stay stuck on mixed logographic and phonetic systems.
OskarS 23 hours ago [-]
There's always a sliding scale between "proto-writing" and fully developed writing systems, but just using symbols phonetically instead of semantically happens much faster than that. The very archaic forms of proto-cuneiform are from about 3400 BCE (though things like clay tokens are much older). It, like virtually all writing systems in the world, developed into a true writing system by use of the "rebus principle", where symbols came to acquire different meanings based on phonetics in the same way as in a rebus. Like, in English, if you had a symbol for "female sheep" (a "ewe"), you could start to use it to signify the word "you", even though there's no semantic connection.
The earliest evidence for this in cuneiform is from around 3200-3000 BCE. There is a famous tablet where the symbol for "reed" is used to represent the word "reimburse", because they're both pronounced like gi. By a few hundred years later, cuneiform was a fully fledged phonetic writing system.
prewett 20 hours ago [-]
My understanding of the history of Chinese writing is that it kept trying to go phonetic, but each time they prevented it, because the writing had to be read across an empire with multiple languages. Even so, something like 20% of the characters are quasi-phonetic, with the radical giving the topic and the rest of the character giving the approximate pronunciation, so "the word for {a plant|a thing of metal|a person|etc} that sounds similar to X".
When the Japanese imported it, they used the characters much more phonetically. They used the whole word when that worked, but the characters got assigned to the Japanese pronunciation of the word, as well as the pronunciation from the pieces of other words where that character appeared, as well as the Chinese pronunciations. Then six hundred years or so later they imported them again, by which point the characters had evolved in Chinese but not in Japanese. So it's sort of phonetic, but it's a complete mess.
In Japanese at this point most kanji have an onyomi (the sound of the Chinese word, which has been adopted into Japanese the way Latin words like "adopt" are adopted into English) and at least one kunyomi (the sound of a synonymous Japanese word not derived from Chinese). This does add difficulty but it is somewhat compensated for by the smaller repertoire of characters used in Japanese. A lot of the most common Japanese words, all loanwords from languages like English, and all the inflectional suffixes are normally written with one of two purely phonetic syllabaries.
thaumasiotes 2 hours ago [-]
I would take issue with
>> When the Japanese imported it, they used the characters much more phonetically.
Japanese kanji are much less phonetic than Chinese hanzi. For hanzi, you can ask "how is this character read?", and it's a simple question with a simple answer, because that question is the basis of the writing system. Kanji are assigned all kinds of different readings on the theory that what really counts is the semantics.
For example...
>> They used the whole word when that worked
Not even in the oldest Chinese writings do you see one character representing a multisyllabic word. Identifying characters with words rather than syllables is an innovation on the part of the Japanese.
Tangentially, you mentioned that the vast majority of characters are phono-semantic compounds. I've been watching some youtube videos in which Japanese people are presented with kanji of varying levels of obscurity and asked to speculate on their pronunciation. Without fail, when they don't know the answer, the interviewees speculate that the two major components of the character both contribute to its meaning.
And that always surprises me because a two-meaningful-components construction is so rare in the character system. Almost all characters aren't constructed from two meaningful elements, and I would have thought the Japanese would be familiar with that fact even though they can't understand the phonetic hints. Do you think this is more of a case of them not knowing how characters are formed ("ignorance"), or more of a case of them speculating on the meaning of each component purely because they don't have the ability to speculate about the phonetics ("searching under the lamppost")?
[Particularly where the obscure kanji are part of an obscure phrase borrowed from Chinese, speculating about the phonetics would be helpful to the problem, but I'm assuming most Japanese just plain don't know what kinds of sounds a Chinese phonetic component might be hinting at.]
thaumasiotes 1 days ago [-]
The modern Chinese writing system is fully phonetic, just with extremely complex spelling. There is no pretense that characters represent ideas or words. They represent syllables.
Phonetic use of the characters was immediate. The go-to example here is 來, which depicts a stalk of wheat. It is the spelling of the verb "come", and the verb is spelled that way because the character for "wheat" was borrowed with no alterations to represent its own pronunciation, which was shared with the verb.
cyberax 22 hours ago [-]
I speak Chinese :)
It's a mixed system with about 2 millennia of legacy. It started as logographic, then it got into phono-semantic compounds, with detours into the written-only official language (like Latin), and now it's a messy mix of everything. There are true logographs (休,林,森), true phonosemantic compounds, and plenty of purely phonetic characters that have no meaning by themselves ("bound morphemes").
thaumasiotes 22 hours ago [-]
> now it's messy mix of everything. There are true logographs
Don't confuse the origin of the system with what the system is now.
Using your example, what do you see as the difference between the "logographs" 森 and 林?
Neither can be a logograph, because neither one represents a word. But even if that weren't the case, on the assumption that they are simply pictures representing concepts, how would you know which one was which?
What does it mean, to you, that the word "forest" must be written 森林 and not 森?
> and [there are] plenty purely phonetic characters that have no meaning by themselves ("bound morphemes").
...yes. 森 and 林 both belong to that category. But you've specifically contrasted them with it. I can't tell what you're thinking of.
Characters can be classified by origin, so that 森 is "从林从木", 切 is "从刀七声", and 下 is "指事". You seem to be reaching for this, but "bound morpheme" is a classification of the current use of the linguistic element, not of the origin of the way it's spelled.
kragen 11 hours ago [-]
图样图森破
thaumasiotes 2 hours ago [-]
I'm not sure I follow you, but it seems worth noting that those five characters are the transcription of an English phrase. (In this case, "森" is just the first half of the word "simple".)
sameermanek 1 days ago [-]
It was invented by store managers to counter the karens of that time. /s
Oops, I'm not on reddit, sorry
tzury 1 days ago [-]
Quote:
I wonder how people store dates older than this.
Maybe if I’m a British Museum manager, and I want to keep theft inventory details.
How do I do it? As an epoch? Store it as text?
The answer: Text.
Many items in museums have no specific date but Circa X.
I spent a lot of time in the early 2000s enabling "Sort by date" in museum registrars software I was maintaining, despite the dates being stored as text.
ghurtado 1 days ago [-]
> "Sort by date" in museum registrars software
This sounds like the perfect invitation for some old school over engineering.
I'm already having so much fun running through every possible input in my head, and I would inevitably write a serious mountain of steaming code to support it.
tzury 1 days ago [-]
I simply built a side table in the DB, where every expression was associated with a range of 2 YEARS (numbers).
any time they entered an expression (auto complete), if they introduced a new one, they needed to add the range.
this did the job.
the time I spent the most was to sort the existing data and restore it in the new dictionary.
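A minimal sketch of the side-table idea described above, not the original implementation. The table names, column names, sample expressions, and the convention of negative integers for BCE years are all assumptions made for illustration.

```python
import sqlite3

# In-memory database; schema and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_expr (
    expr       TEXT PRIMARY KEY,   -- the free-text date as entered
    start_year INTEGER NOT NULL,   -- earliest plausible year (negative = BCE)
    end_year   INTEGER NOT NULL    -- latest plausible year
);
CREATE TABLE item (
    name      TEXT NOT NULL,
    date_expr TEXT NOT NULL REFERENCES date_expr(expr)
);
""")

# Each new expression must be registered with its two-year range.
conn.executemany(
    "INSERT INTO date_expr VALUES (?, ?, ?)",
    [
        ("circa 3100 BCE", -3150, -3050),
        ("circa 2600 BCE", -2650, -2550),
        ("19th century", 1801, 1900),
    ],
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [
        ("painting", "19th century"),
        ("tablet", "circa 3100 BCE"),
        ("proverbs", "circa 2600 BCE"),
    ],
)

# "Sort by date" becomes a join against the side table.
rows = conn.execute("""
    SELECT item.name
    FROM item
    JOIN date_expr ON item.date_expr = date_expr.expr
    ORDER BY date_expr.start_year, date_expr.end_year
""").fetchall()
sorted_names = [row[0] for row in rows]
print(sorted_names)
```

The displayed text stays whatever the registrar typed; only the sort key lives in the side table.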
kehvyn 11 hours ago [-]
I've actually done this, and it's very fun.
My main testing dataset is the 470,000 records from the Met, with 33k unique date values. Fortunately they include machine-readable dates I can validate against.
That's got to be a study in exceptions. Let's start with which calendar we're referencing. C14 anybody?
Waraqa 1 days ago [-]
I assume an integer field is sufficient since mostly it's only the year that they know not the exact date.
Atlas667 1 days ago [-]
Text is better together with specific formats like Circa, ranges, exact years or dates, and unknown.
brazzy 1 days ago [-]
No, you need a lot more complexity if you really wanted to represent it semantically. The assumption that people in the past used calendars with sequentially numbered years you just need to offset, is simply wrong.
You have things like "in the Xth year of the reign of King Y", where we can easily relate multiple entries with different values for X, but don't actually know which CE years they correspond to. Even weirder is the Roman habit of recording "the year of the consulship of X and Y", which doesn't even allow you to relate any two different years at all without a reference table (which we don't have completely). And no, "years from the founding of the city" wasn't a thing.
kehvyn 10 hours ago [-]
I actually looked at supporting dates like this, but if you go through the Met's open dataset (https://github.com/metmuseum/openaccess) that kind of "alternative calendar with no reference to the BCE/CE dates" is basically nonexistent.
There are references to the Islamic and Japanese calendar systems, but always next to the CE equivalent.
Data entry is fortunately being done by modern people, so the translation to CE/BCE is usually baked in, and all you need to support is every possible way somebody could say "the early half of" and "5th millenium B.C. to mid 1914"
> The Varronian chronology was adopted by the Roman state during the first century BC and gave rise to the traditional years ab urbe condita ("from the founding of the city"); most especially, those dates were used in monumental Augustan-era inscriptions, the fasti Capitolini and the fasti Triumphales.[40]
lazide 1 days ago [-]
Turing complete DSL, here we come!
rossant 1 days ago [-]
Tiny language model.
divbzero 24 hours ago [-]
> The answer: Text.
That was my immediate thought too and led to me wondering: How do you represent BCE dates in ISO 8601?
Apparently ISO 8601 always supports YYYY from 0000 (1 BCE) to 9999 (9999 CE). ISO 8601 can also extend beyond those limits if agreed upon by sender and receiver: e.g. -0001 (2 BCE), -0002 (3 BCE), etc.
ralferoo 15 hours ago [-]
Hopefully nobody uses this "standard" that bakes in an off-by-one error into the human readable form.
IMHO if code is doing extra parsing to handle -ve years, it should have enough logic to know how to skip the zeroth year when converting to and from the human readable form.
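For what it's worth, the conversion being discussed is a one-liner each way. A rough sketch, assuming the numbering described above (0000 is 1 BCE, -0001 is 2 BCE); the function names are made up for illustration:

```python
def iso_year_to_label(year: int) -> str:
    """Convert an ISO 8601 style year number to a human-readable label."""
    if year > 0:
        return f"{year} CE"
    # Year 0 is 1 BCE, year -1 is 2 BCE, and so on.
    return f"{1 - year} BCE"


def label_to_iso_year(number: int, era: str) -> int:
    """Convert a human-readable year (e.g. 3, "BCE") to an ISO 8601 style year number."""
    if number < 1:
        raise ValueError("human-readable years start at 1; there is no year 0")
    if era == "CE":
        return number
    if era == "BCE":
        return 1 - number
    raise ValueError(f"unknown era: {era!r}")


print(iso_year_to_label(0))         # 1 BCE
print(iso_year_to_label(-1))        # 2 BCE
print(label_to_iso_year(3, "BCE"))  # -2
```

The upside of the offset is that year arithmetic across the era boundary is plain subtraction, which is presumably why astronomers like it.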
wvbdmp 13 hours ago [-]
Okay… someone please steelman this seemingly unhinged decision.
edit: Apparently that’s how they do dates in astronomy since it makes the math easier. Can’t even count on years being gregorian these days…
scrollaway 18 hours ago [-]
I think the world could use an “imprecise” data type, which would be a tuple (t, margin).
In your case: if you wanted a date plus minus 50yrs, that would be (date(d), range(years, 50)).
Some construction like this allows for I believe most use cases. You just need to be able to store: date, date time, date range, and the precise/imprecise versions of all of these.
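A rough sketch of what such a (t, margin) type might look like for years only. The class name, field names, and the use of negative integers for BCE years are assumptions for illustration, not an existing library type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpreciseYear:
    """A year plus a symmetric margin of error, both in whole years."""
    year: int        # best estimate (negative = BCE, an assumed convention)
    margin: int = 0  # 0 means the value is precise

    @property
    def earliest(self) -> int:
        return self.year - self.margin

    @property
    def latest(self) -> int:
        return self.year + self.margin

    def definitely_before(self, other: "ImpreciseYear") -> bool:
        """True only if the whole range lies before the other range."""
        return self.latest < other.earliest

    def may_overlap(self, other: "ImpreciseYear") -> bool:
        """True if the two ranges share at least one year."""
        return self.earliest <= other.latest and other.earliest <= self.latest


tablet = ImpreciseYear(-3000, 50)    # "3000 BCE plus or minus 50 years"
proverbs = ImpreciseYear(-2600, 50)
print(tablet.definitely_before(proverbs))
print(tablet.may_overlap(ImpreciseYear(-2950, 50)))
```

Comparisons become three-valued in practice (before, after, can't tell), which is the part a plain TIMESTAMP can't express.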
jcims 24 hours ago [-]
I'm 52 years old and it has been this way since I can remember but for some reason I can't make it not bug me. Any time we have the biggest/oldest/smallest/fastest/etc example of something, it's described without any qualification of seen, known, observed, etc.
For example, this isn't the oldest recorded transaction, it's the oldest widely known record of a transaction (probably).
Why does that still bother me? Obviously nobody is saying it's the oldest recorded transaction, right? That would make it the first recorded transaction, and nobody is calling it that.
And here I am likely triggering your own pet peeve of useless comments on HN. xD
namenotrequired 11 hours ago [-]
It can be the oldest without being the first, if the earlier ones no longer exist
throw0101c 1 days ago [-]
Some more info:
> This tablet with early writing most likely documents grain distributed by a large temple. Scholars have distinguished two phases in the development of writing in southern Mesopotamia. The earliest tablets, probably dating to around 3300 B.C., record economic information using pictographs and numerals drawn in the clay. A later phase, as represented by this tablet, reflects changes in the techniques of writing that altered the shapes of signs. Symbols stood for nouns, primarily names of commodities, as well as a few basic adjectives, but no grammatical elements. Such a system could be read in any language, but it is generally accepted that the underlying language is Sumerian. Indeed, by the first half of the third millennium B.C., the script had sufficiently developed to faithfully represent the Sumerian language, and the scope and application of writing was expanded to include written poetry. Nonetheless, even these later scribes rarely included grammatical elements, and the texts, created as memory aids, cannot be easily read today.
> A later phase, as represented by this tablet, reflects changes in the techniques of writing that altered the shapes of signs. Symbols stood for nouns, primarily names of commodities, as well as a few basic adjectives, but no grammatical elements.
From Weavers, Scribes, and Kings:
> The reason that the artist immortalized Ushumgal and Shara-igizi-Abzu is that they were involved in a transaction so important that a record of it was carved onto a stone boulder, complete with pictures of the main parties. The roughly drawn cuneiform signs that litter the sides of the boulder, and even extend over the figures themselves, record that this transaction pertained to animals, land, and houses, in large quantities: 450 iku of fields are mentioned (about 158 hectares or 392 acres), along with three houses and some bulls, donkeys, and sheep. Unfortunately, the inscription suffers from a dire shortage of verbs, which would have been useful in determining what exactly was going on.
No, it is proposing an ambitious rearchitecture of current systems on new principles which it argues were present in an incomplete form in many existing systems. Since it was written, some of its proposals have been implemented to generally good effect, but by no means all. It's a visionary manifesto for the future, though filled with practical information.
For people who don't want to be assaulted by aggressive ads
Boogie_Man 1 days ago [-]
Excellent and straightforward negotiation, reminds me a bit of how mobsters speak in film combined with how God speaks in the old testament.
ants_everywhere 1 days ago [-]
dude probably needed to make some swords or something
behnamoh 1 days ago [-]
> I call it rock solid durability.
Literally! But this is survivor bias: you only see a piece that remained intact for 5k years, and I bet 99% of them were eroded/destroyed over time.
rthnbgrredf 1 days ago [-]
While survivor bias is relevant, I strongly doubt any modern transaction stored digitally in a DB such as Postgres could last 5k years.
tonyhart7 1 days ago [-]
but they can tho??? no one said about transferring such data into another disk
MangoToupe 1 days ago [-]
The IBM 80-hole punch card will only turn 100 in 2028. Who knows what the world will look like in 2128.
numbsafari 1 days ago [-]
I bet the clay tablet will be just fine. No word on moldy cards.
thaumasiotes 1 days ago [-]
Note that it isn't the norm for clay tablets to survive. We have lots of them, far more than we're willing to provide the manpower to read, but in most cases[1] that's not because they were made to be durable.
Whenever a city was conquered, the tablets there were immortalized as the city burned down. But cities that didn't get sacked didn't burn down, and their tablets could be lost. For example, we don't have the clay records from Hammurabi's reign in Babylon, because (a) he was a strong king, and Babylon wasn't conquered on his watch; and (b) he reigned a long time ago, and that period of Babylon sank below the water table, dissolving all the records.
[1] Some tablets were intentionally fired for posterity.
IceHegel 1 days ago [-]
The idea that artifacts belong forever to whoever inhabits the land today is going to be put under increasing pressure as ancient DNA continues to reveal the number and severity of population replacements over time.
ccorcos 1 days ago [-]
I think these numerical constraints are because range trees use numerical averages to construct themselves. This is important for OVERLAPS queries common with dates. But you could construct interval tree indexes lexicographically using text but they are quite uncommon. It’s something I’ve experimented with a decent amount though.
I expected there would be constraints, but the chosen range is quite intriguing. The PostgreSQL spec says the 4-byte date type spans 4713 BC to 5,874,897 AD. It gives much more headroom for future dates—did they assume preserving data before 4713 BC is unlikely?
bloak 14 hours ago [-]
That range of dates seems to correspond to (1UL << 31) days so I suspect they're using only 31 bits so I wonder why they didn't make it signed and extend it to 5,884,322 BC.
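A quick back-of-the-envelope check of that guess. It only compares the span of the two endpoints quoted above against 2^31 days, using the average Gregorian year length; it says nothing about how PostgreSQL actually lays out the value internally.

```python
# Number of distinct day values in 31 bits.
DAYS = 1 << 31

# Average length of a Gregorian year in days.
GREGORIAN_YEAR_DAYS = 365.2425

# How many years 2**31 days would cover.
span_years = DAYS / GREGORIAN_YEAR_DAYS

# Rough span of the quoted range, 4713 BC to 5,874,897 AD
# (ignoring the missing year zero, which shifts this by one).
documented_span = 4713 + 5_874_897

print(f"2^31 days is about {span_years:,.0f} years")
print(f"quoted range is about {documented_span:,} years")
```

The two figures agree to within a year, so the 31-bit reading looks at least plausible.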
foundart 1 days ago [-]
Wow, so much older than the oldest known complaint
You can store whatever ancient timestamp you want in SQLite, it's Just Text or Just an Integer, although the date library functions may not support it
csomar 15 hours ago [-]
> I wonder how people store dates older than this. Maybe if I’m a British Museum manager, and I want to keep theft inventory details. How do I do it? As an epoch? Store it as text? Use some custom system? How do I get it to support all the custom operations that a typical TIMESTAMP supports?
Think about how the museum's physical text book stores it: as simple text with processing offloaded to the reader (ie: circa 4000BC, Before 2000BC, After ...)
I wonder, if for some problems, we'll move to LLM computation instead of a developer coded solution.
Your variables will be
let date_1 = "2000 BC"
let date_2 = "3000 B.C."
and when you execute
if date_1 > date_2 { .. do something .. }
The ">" operator is overloaded to run this operation through an LLM and return True/False.
Essentially they have an "Object Date" field that's a human-readable string and could be anything, and then they include "Object Start Date" and "Object End Date" that are integer years so that it's machine readable and you can do those comparisons.
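With that layout the comparison from the comment above needs no LLM. A small sketch, assuming BC years are stored as negative integers; the type, the field names, and the helper are hypothetical and only mirror the three fields described:

```python
from typing import NamedTuple


class ObjectDate(NamedTuple):
    display: str     # human-readable "Object Date", could be anything
    start_year: int  # "Object Start Date" (negative = BC, an assumption here)
    end_year: int    # "Object End Date"


def is_later(a: ObjectDate, b: ObjectDate) -> bool:
    """True if a's whole range comes after b's whole range."""
    return a.start_year > b.end_year


date_1 = ObjectDate("2000 BC", -2000, -2000)
date_2 = ObjectDate("3000 B.C.", -3000, -3000)

if is_later(date_1, date_2):
    print(f"{date_1.display} is later than {date_2.display}")
```

The display string can keep whatever punctuation the cataloguer used, since only the integer bounds take part in the comparison.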
DaveZale 1 days ago [-]
beer drinking goes waaaay back
ceejayoz 1 days ago [-]
Even kids drank (weak) beer, in part because the alcohol content kills pathogens. Even today clean water can be tough to find for some folks.
behringer 1 days ago [-]
There are many who think society itself was formed in order to make alcohol. Without alcohol there would be little reason to grow so much grain and thus little reason to have so many people in one place.
NL807 1 days ago [-]
malt is used in baking as well
DaveZale 1 days ago [-]
they evolved in tandem. Back in the day, old bread would be used as a starter culture for beer.
One can also use live-beer when making a starter dough. Neat flavours.
DaveZale 1 days ago [-]
cool, I'll try that, muchas gracias
metalman 1 days ago [-]
way before the tablet was made: residues on pottery 5 thousand years older, 8-9-10 k yrs bp,
show that grains were soaked and lightly fermented, to increase nutritional content, palatability/texture, with this practice being carried on all the way through the stone ages.
some have suggested that fermentation was the primary impetus for building the first semi permanent dwellings....beer first, somewhere to hang out was a bonus
pipeline_peak 1 days ago [-]
Me buying a PlayStation 1 in 1997
barbazoo 1 days ago [-]
> Considering this thing survived 5000 years (holy shit!) with zero downtime and has stronger durability guarantees than most databases today.
One could argue that it had 5000 years of downtime when no one knew where it was /s
all2 1 days ago [-]
Network connectivity was an issue, though. If we look at only network uptime, it's quite low. Something like 0.02% uptime over 5000 years.
hagbard_c 1 days ago [-]
Interesting write-up marred by the injection of politics: Maybe if I’m a British Museum manager, and I want to keep -theft- inventory details
Ideological jabs like this are fine in political discussions but they don't add anything elsewhere and serve only to lower the trustworthiness of what is written due to implied bias.
This is not an academic piece but a blog which is trying to be light hearted. The first sentence says
"The other day I posted a tweet with this image which I thought was funny:"
So not being 100% serious is to be expected.
aaronharnly 1 days ago [-]
As an aside to this aside on the aside…
I've gotten into reading Tintin books with my kid, as I did when I was about his age. They're grand adventures and sort-of progressive, for their era.
But the basic structure of many of the stories is still basically "let's get this rare artifact from [South America, Africa, Asia] out of the hands of the thieves stealing it, and back into a museum in England, where it belongs!" And I gotta say it grates.
antonvs 15 hours ago [-]
I imagine the location of the museum would have been different in the original French, but the same European colonial attitude applies.
croisillon 12 hours ago [-]
* Belgian
ascorbic 9 hours ago [-]
They were written in French.
kragen 11 hours ago [-]
Not a thing.
antonvs 15 hours ago [-]
What's the implied bias? That they like facts?
novemp 24 hours ago [-]
Thanks for the confirmation that "politics" just means "facts".
https://dynamicland.org/
Not 20%, more like 90%. https://en.wikipedia.org/wiki/Chinese_character_classificati...
It's a mixed system with about 2 millennia of legacy. It started as logographic, then it got into phono-semantic compounds, with detours into the written-only official language (like Latin), and now it's a messy mix of everything. There are true logographs (休,林,森), true phonosemantic compounds, and plenty of purely phonetic characters that have no meaning by themselves ("bound morphemes").
Don't confuse the origin of the system with what the system is now.
Using your example, what do you see as the difference between the "logographs" 森 and 林?
Neither can be a logograph, because neither one represents a word. But even if that weren't the case, on the assumption that they are simply pictures representing concepts, how would you know which one was which?
What does it mean, to you, that the word "forest" must be written 森林 and not 森?
> and [there are] plenty purely phonetic characters that have no meaning by themselves ("bound morphemes").
...yes. 森 and 林 both belong to that category. But you've specifically contrasted them with it. I can't tell what you're thinking of.
Characters can be classified by origin, so that 森 is "从林从木", 切 is "从刀七声", and 下 is "指事". You seem to be reaching for this, but "bound morpheme" is a classification of the current use of the linguistic element, not of the origin of the way it's spelled.
Oops, I'm not on Reddit, sorry
Many items in museums have no specific date, only "circa X". In the early 2000s I spent a lot of time enabling "Sort by date" in the museum registrars' software I was maintaining, even though the dates were stored as text.
This sounds like the perfect invitation for some old school over engineering.
I'm already having so much fun running through every possible input in my head, and I would inevitably write a serious mountain of steaming code to support it.
Any time they entered an expression (with autocomplete), if they introduced a new one, they needed to add the range.
This did the job.
Most of my time went into sorting the existing data and storing it again in the new dictionary.
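A minimal sketch of the dictionary idea described above, not the original software: each textual date expression maps to a year range, and sorting uses that range. The table name `DATE_RANGES`, the sample expressions, and their ranges are all made up for illustration; negative numbers here simply mean "N BCE".

```python
# Hypothetical lookup table: each textual date expression maps to an
# inclusive (start_year, end_year) range. A negative year -N means N BCE.
DATE_RANGES = {
    "circa 3000 BC": (-3050, -2950),
    "early 2nd millennium BC": (-2000, -1667),
    "19th century": (1801, 1900),
}


def sort_key(expression):
    """Return a sortable key; expressions missing from the table sort last."""
    known = expression in DATE_RANGES
    start, end = DATE_RANGES.get(expression, (0, 0))
    return (not known, start, end)


def sort_by_date(records):
    """Sort records (dicts with a textual 'date' field) by their mapped range."""
    return sorted(records, key=lambda record: sort_key(record["date"]))
```

New expressions only become sortable once someone adds their range to the table, which matches the workflow in the comment.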
My main testing dataset is the 470,000 records from the Met, with 33k unique date values. Fortunately they include machine-readable dates I can validate against.
https://github.com/kjrocker/epochal
You have things like "in the Xth year of the reign of King Y", where we can easily relate multiple entries with different values for X, but don't actually know which CE years they correspond to. Even weirder is the Roman habit of recording "the year of the consulship of X and Y", which doesn't even allow you to relate any two different years at all without a reference table (which we don't have completely). And no, "years from the founding of the city" wasn't a thing.
There are references to the Islamic and Japanese calendar systems, but always next to the CE equivalent.
Data entry is fortunately being done by modern people, so the translation to CE/BCE is usually baked in, and all you need to support is every possible way somebody could say "the early half of" and "5th millennium B.C. to mid 1914"
https://en.wikipedia.org/wiki/Varronian_chronology
> The Varronian chronology was adopted by the Roman state during the first century BC and gave rise to the traditional years ab urbe condita ("from the founding of the city"); most especially, those dates were used in monumental Augustan-era inscriptions, the fasti Capitolini and the fasti Triumphales.[40]
That was my immediate thought too and led to me wondering: How do you represent BCE dates in ISO 8601?
Apparently ISO 8601 always supports YYYY from 0000 (1 BCE) to 9999 (9999 CE). ISO 8601 can also extend beyond those limits if agreed upon by sender and receiver: e.g. -0001 (2 BCE), -0002 (3 BCE), etc.
IMHO if code is doing extra parsing to handle negative years, it should have enough logic to know how to skip the zeroth year when converting to and from the human-readable form.
edit: Apparently that’s how they do dates in astronomy since it makes the math easier. Can’t even count on years being Gregorian these days…
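For what it's worth, the year-zero offset described above fits in a few lines. This is only a sketch of the conversion (year 0000 is 1 BCE, -0001 is 2 BCE, and so on); the function names are made up.

```python
def astronomical_to_label(year):
    """Convert an astronomical / ISO 8601 year number to a BCE/CE label."""
    if year <= 0:
        # Year 0 is 1 BCE, year -1 is 2 BCE, and so on.
        return f"{1 - year} BCE"
    return f"{year} CE"


def bce_to_astronomical(bce_year):
    """Convert a BCE year (1, 2, 3, ...) to its astronomical year number."""
    if bce_year < 1:
        raise ValueError("BCE years start at 1; there is no 0 BCE")
    return 1 - bce_year
```

With year zero in place, the distance between two years is a plain subtraction, which is presumably why astronomers prefer it.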
In your case: if you wanted a date plus minus 50yrs, that would be (date(d), range(years, 50)).
Some construction like this allows for I believe most use cases. You just need to be able to store: date, date time, date range, and the precise/imprecise versions of all of these.
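A rough sketch of that kind of construction, covering only whole years (dates, datetimes, and explicit ranges would need more fields). The class name `ImpreciseYear` and its fields are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpreciseYear:
    """A year plus an optional uncertainty, like (date(d), range(years, 50))."""

    year: int            # central estimate
    plus_minus: int = 0  # uncertainty in years; 0 means the year is precise

    @property
    def earliest(self):
        return self.year - self.plus_minus

    @property
    def latest(self):
        return self.year + self.plus_minus

    def definitely_before(self, other):
        """True only if the two ranges cannot overlap."""
        return self.latest < other.earliest
```

So `ImpreciseYear(-3000, 50)` covers -3050 through -2950, and a precise value is just the same type with `plus_minus` left at 0.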
For example, this isn't the oldest recorded transaction, it's the oldest widely known record of a transaction (probably).
Why does that still bother me? Obviously nobody is saying it's the oldest recorded transaction, right? That would make it the first recorded transaction, and nobody is calling it that.
And here I am likely triggering your own pet peeve of useless comments on HN. xD
> This tablet with early writing most likely documents grain distributed by a large temple. Scholars have distinguished two phases in the development of writing in southern Mesopotamia. The earliest tablets, probably dating to around 3300 B.C., record economic information using pictographs and numerals drawn in the clay. A later phase, as represented by this tablet, reflects changes in the techniques of writing that altered the shapes of signs. Symbols stood for nouns, primarily names of commodities, as well as a few basic adjectives, but no grammatical elements. Such a system could be read in any language, but it is generally accepted that the underlying language is Sumerian. Indeed, by the first half of the third millennium B.C., the script had sufficiently developed to faithfully represent the Sumerian language, and the scope and application of writing was expanded to include written poetry. Nonetheless, even these later scribes rarely included grammatical elements, and the texts, created as memory aids, cannot be easily read today.
* https://www.metmuseum.org/art/collection/search/327385
* https://en.wikipedia.org/wiki/Jemdet_Nasr_period
From Weavers, Scribes, and Kings:
> The reason that the artist immortalized Ushumgal and Shara-igizi-Abzu is that they were involved in a transaction so important that a record of it was carved onto a stone boulder, complete with pictures of the main parties. The roughly drawn cuneiform signs that litter the sides of the boulder, and even extend over the figures themselves, record that this transaction pertained to animals, land, and houses, in large quantities: 450 iku of fields are mentioned (about 158 hectares or 392 acres), along with three houses and some bulls, donkeys, and sheep. Unfortunately, the inscription suffers from a dire shortage of verbs, which would have been useful in determining what exactly was going on.
https://www.thearchaeologist.org/blog/complaint-tablet-to-ea...
https://knowyourmeme.com/memes/complaint-tablet-to-ea-nasir
For people who don't want to be assaulted by aggressive ads
Literally! But this is survivor bias: you only see a piece that remained intact for 5k years, and I bet 99% of them were eroded/destroyed over time.
Whenever a city was conquered, the tablets there were immortalized as the city burned down. But cities that didn't get sacked didn't burn down, and their tablets could be lost. For example, we don't have the clay records from Hammurabi's reign in Babylon, because (a) he was a strong king, and Babylon wasn't conquered on his watch; and (b) he reigned a long time ago, and that period of Babylon sank below the water table, dissolving all the records.
[1] Some tablets were intentionally fired for posterity.
https://github.com/ccorcos/database-experiments/blob/master/...
so about 42175 BC
:)
[1] https://en.wikipedia.org/wiki/Tally_stick#Possible_palaeolit...
https://en.m.wikipedia.org/wiki/Complaint_tablet_to_Ea-n%C4%...
Think about how a physical museum textbook stores it: as simple text, with processing offloaded to the reader (i.e. circa 4000 BC, before 2000 BC, after ...)
I wonder, if for some problems, we'll move to LLM computation instead of a developer coded solution.
Your variables will be
and when you execute
The ">" operator is overloaded to run this operation through an LLM and return True/False.
Essentially they have an "Object Date" field that's a human-readable string and could be anything, and then they include "Object Start Date" and "Object End Date" that are integer years so that it's machine readable and you can do those comparisons.
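To make the start/end idea concrete, here is a small sketch of comparisons on those integer fields. The field names follow the comment above; the sample records and helper names are made up, and negative years are assumed to mean BCE.

```python
# Hypothetical records: field names follow the comment above, values are made up.
records = [
    {"Object Date": "ca. 3100-2900 B.C.", "Object Start Date": -3100, "Object End Date": -2900},
    {"Object Date": "early 19th century", "Object Start Date": 1800, "Object End Date": 1833},
    {"Object Date": "5th millennium B.C.", "Object Start Date": -5000, "Object End Date": -4001},
]


def could_date_to(record, year):
    """True if the integer year falls inside the record's start/end range."""
    return record["Object Start Date"] <= year <= record["Object End Date"]


def made_before(record, year):
    """True only if the record's whole range ends before the given year."""
    return record["Object End Date"] < year


# Sort oldest first using the machine-readable fields, ignoring the free text.
oldest_first = sorted(
    records, key=lambda r: (r["Object Start Date"], r["Object End Date"])
)
```

The human-readable string stays untouched for display, and all ordering and filtering goes through the two integers.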
https://www.getty.edu/360/event_images/mesobeer.pdf
One could argue that it had 5000 years of downtime when no one knew where it was /s
Ideological jabs like this are fine in political discussions but they don't add anything elsewhere and serve only to lower the trustworthiness of what is written due to implied bias.
This is not an academic piece but a blog that is trying to be lighthearted. The first sentence says "The other day I posted a tweet with this image which I thought was funny:" So not being 100% serious is to be expected.
I've gotten into reading Tintin books with my kid, as I did when I was about his age. They're grand adventures and sort-of progressive, for their era.
But the basic structure of many of the stories is still "let's get this rare artifact from [South America, Africa, Asia] out of the hands of the thieves stealing it, and back into a museum in England, where it belongs!" And I gotta say it grates.