Are there any existing projects or proposed standards for using git (a source version control system) to collaborate on building genealogical source records and/or conclusions?
Git (best known through hosting sites like GitHub and Bitbucket, but available as a command-line or graphical application on most platforms) is a way to store (mostly) text and programming source files so that different people in different places can change individual lines within the files, and have those changes merged together with a full audit trail of who changed what (viewable with "blame").
It's also possible to split data off into a branch, make proposed changes to it, and then submit it back to those responsible for the main branch as suggested improvements, which can be accepted or rejected, and merged back into the main branch. So it's not a free-for-all like a wiki: there are ways to keep control over the data yet still accept improvements.
It seems it would make a good fit for genealogy source data, which is often text-based (perhaps with images attached). It would make it easy for many people to add to and correct transcriptions and, more importantly, to see who changed what.
It could also be a way to store assertions, and provenance trails - show how a conclusion was reached, in a format that allows corrections and new data.
There are projects around to store gedcom data in git (example: git-ged), which is one way of handling legacy data, but I'm thinking it would be more useful for newer standards that are source-based, rather than just for conclusion reporting formats like Gedcom.
Many of the new standards like Gedcom-X are XML-based (perhaps with an alternate json representation). Neither XML nor (to a lesser extent) json maps well to a line-based version control system (because it can be hard to retain the structure when individual lines are changed by different people at the same time). Those formats are also difficult for a human to edit.
Although gedcom is line-based, it too has a structure that is easy to mess up with multiple editors, and it has other limitations (such as lots of cross-references that are easy to break).
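To illustrate the line-based problem, here is a minimal Python sketch, assuming a made-up record (the field names are hypothetical and not taken from Gedcom-X). Two editors each fix a different field. When the json is stored on one line, both edits touch the same line, which a line-based merge would treat as a conflict. When it is written with sorted keys and one value per line, the edits land on different lines. The overlap check is only a rough stand-in for what git does when merging.

```python
import difflib
import json

# Hypothetical record; the field names are illustrative only.
record = {"name": "Mary Smith", "event": "baptism", "date": "1841-03-07", "place": "Leeds"}
editor_a = dict(record, date="1841-03-14")   # editor A corrects the date
editor_b = dict(record, place="Bradford")    # editor B corrects the place


def touched_lines(before: str, after: str) -> set:
    """Return the 0-based line numbers of `before` that an edit changes."""
    matcher = difflib.SequenceMatcher(
        a=before.splitlines(), b=after.splitlines(), autojunk=False
    )
    touched = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            touched.update(range(i1, i2))
    return touched


def overlap(serialise) -> set:
    """Lines changed by BOTH editors under a given serialisation."""
    base = serialise(record)
    return touched_lines(base, serialise(editor_a)) & touched_lines(base, serialise(editor_b))


# Compact form: the whole record is one line, so both edits hit line 0.
compact_overlap = overlap(json.dumps)

# Canonical form: sorted keys, one value per line, so the edits do not collide.
pretty_overlap = overlap(lambda r: json.dumps(r, indent=2, sort_keys=True))

print("compact:", compact_overlap)
print("pretty: ", pretty_overlap)
```

So a json-based format is workable only if everyone's tools write it out in the same canonical layout, which is one of the restrictions a git-friendly format would need to spell out.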
Hosting sites built on git (GitHub, for example) have been extended to display data in different ways, for example to map the data if it is in geojson (a geographic point/line format), or to show tables if it is in csv. So there could be certain requirements (restrictions) of a text-based genealogy format that would make it possible to generate reports from it. It doesn't have to be directly usable in a program, just something that can be edited by multiple people in a fairly easy way.
For source data (lists of events, simple transcriptions) it's maybe simplest just to stick to csv (comma-separated values) files, which can be easily imported (and exported) by most software.
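As a rough sketch of why csv suits this, assume a transcription with one event per row (the register, column names and ids below are invented for illustration). Correcting one surname and writing the file back in a stable row order changes exactly one line, which is the kind of change git can attribute and merge cleanly.

```python
import csv
import io

# Hypothetical transcription of a baptism register; columns are illustrative.
CSV_TEXT = """id,date,forename,surname,place
b001,1841-03-07,Mary,Smith,Leeds
b002,1841-04-11,John,Brown,Leeds
b003,1841-05-02,Ann,Taylor,Bradford
"""


def load(text):
    """Parse csv text into a list of dicts, one per event."""
    return list(csv.DictReader(io.StringIO(text)))


def dump(rows, fieldnames):
    """Write rows back out sorted by id, so the row order never shuffles."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(sorted(rows, key=lambda row: row["id"]))
    return out.getvalue()


rows = load(CSV_TEXT)
fieldnames = list(rows[0].keys())

rows[1]["surname"] = "Browne"  # a contributor corrects one transcription
new_text = dump(rows, fieldnames)

# Compare the old and new files line by line.
changed = [
    (old, new)
    for old, new in zip(CSV_TEXT.splitlines(), new_text.splitlines())
    if old != new
]
print(changed)
```

Keeping a stable key column and a fixed sort order is a design choice here: without it, two tools could write the same data in different row orders and every line would appear changed.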
It would be good to have a csv description language with a genealogy vocabulary that describes the data. The Simple Data Format is one such way to describe csv files. Has anybody been working on marking up genealogy source files with a system like this?
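To make the idea concrete, here is a hedged Python sketch of what such a description might allow. The descriptor is only loosely in the shape of a Simple Data Format schema (a list of fields, each with a name and a type); the genealogy field names are invented, and the checker is hand-rolled for illustration rather than being any official tool.

```python
import csv
import io
from datetime import date

# Hypothetical descriptor: a list of fields, each with a name and a type.
# The genealogy field names are assumptions made for this example.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "date", "type": "date"},
        {"name": "forename", "type": "string"},
        {"name": "surname", "type": "string"},
        {"name": "age", "type": "integer"},
    ]
}

# How each declared type is checked: the parser raises ValueError on bad input.
PARSERS = {"string": str, "integer": int, "date": date.fromisoformat}


def validate(csv_text, schema):
    """Return a list of (line_number, message) problems found in csv_text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    expected = [field["name"] for field in schema["fields"]]
    if reader.fieldnames != expected:
        return [(1, f"header {reader.fieldnames} does not match {expected}")]
    problems = []
    for row in reader:
        for field in schema["fields"]:
            value = row[field["name"]] or ""
            if value == "":
                continue  # empty or missing cells are treated as "unknown"
            try:
                PARSERS[field["type"]](value)
            except ValueError:
                problems.append(
                    (reader.line_num,
                     f"{field['name']}: {value!r} is not a valid {field['type']}")
                )
    return problems


SAMPLE = """id,date,forename,surname,age
b001,1841-03-07,Mary,Smith,0
b002,1841-02-30,John,Brown,
b003,1841-05-02,Ann,Taylor,infant
"""

problems = validate(SAMPLE, SCHEMA)
for line_number, message in problems:
    print(line_number, message)
```

A check like this could run on every proposed change before it is merged, so that contributors get feedback on malformed dates or misplaced columns without a maintainer reading every row.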
Much of this assumes that the source data (events) are in a format which is shareable (and so editable by many). Unfortunately very little data currently has suitable licenses, but that is changing.