Analysis of PLM Software Conflict Resolution
Revision as of 17:21, 27 July 2019
Question
How does professional-grade PLM software resolve file conflicts? And how does OSE achieve even higher performance using simple online tools? Here is an assessment of the state of the art in each.
Context
Question to Yorik, core FreeCAD developer: OSE will launch a $250k Incentive Challenge on the HeroX platform in September 2020 to build an open source, pro-grade, 3D-printed cordless drill. We expect thousands of participants.
Regarding file conflict resolution: can you comment on what I know about this right now (link below) and on what is currently available in FreeCAD? We plan to simply use our wiki for FreeCAD file version history, with annotated change-log pictures uploaded manually to make design changes transparent, allowing thousands of people to contribute to the same design in near real time.
See
What are your thoughts?
Problem Statement
PLM software is not designed for mass collaboration. Typical teams in industry resolve conflicts by checking out a file and locking it. This does not work for OSE, because a checked-out file cannot be worked on by anyone else at the same time.
The ideal solution is real-time collaboration. Semi-real-time collaboration can occur when a person is connected online to a repository and FreeCAD downloads changes from other contributors roughly once per minute. This is undesirable, as collaborative waste would occur: people have to negotiate conflicting changes with one another, and any incompatible change must result in a fork. This can be avoided simply by starting from a fork in the first place and making a pull request into the main branch later. This is the mechanism that GitHub uses.
However, the GitHub mechanism of high-level conflict resolution does not work for OSE, because an entire repository must be cloned. Because CAD files are so much larger than source code, storage limits this approach very quickly. For example, to change a single 10 MB file in a project of, say, 1 GB total size, each contributor must clone the whole project, so 1000 forks take 1 TB. This is storage-prohibitive.
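The arithmetic behind that estimate can be checked with a minimal sketch. It assumes the worst case, in which every fork is stored as a full, unshared copy of the project; the function name and the decimal units are choices made here for illustration only.

```python
GB = 10**9   # decimal gigabyte
TB = 10**12  # decimal terabyte


def fork_storage_bytes(project_bytes, forks):
    """Total storage if every fork holds a full, unshared copy of the project."""
    return project_bytes * forks


# 1 GB project, 1000 contributors each holding a complete copy.
total = fork_storage_bytes(1 * GB, 1000)
print(total / TB, "TB")
```

The 10 MB file that a contributor actually edits accounts for only 1% of each copy; the rest is duplicated data.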
Potential Solutions
Version Control for PLM in FreeCAD is a complex, but solvable problem. There are many existing complex software layers that try to solve many of the issues. Most are open source. The difficulty is adapting them and putting them together in an easy to use software package.
Ideally, CAD data would be stored in a format that is easy to difference and to put under version or revision control. A FreeCAD workbench with an understandable and somewhat automated workflow might avoid the archive format and put the XML and small binaries into a modified git protocol folder, with tweaks such as binary diff enabled, so that commits and branches don't create data copies. The workbench may need settings to help the user control branching and keep commits and differencing reasonable, as well as to manage and even cull dead-end branches when needed. The ability to see files being worked on (checked out) by other users would help enable communication and planning about versions before any possible conflicts are created. A voting system may also help manage decisions when dealing with large groups of contributors. Changing the FreeCAD file format, or not using the archive format for collaboration, may also enable more fine-grained control, versioning different types of data objects in different ways, similar to proprietary PLM CAD sharing platforms.
Git LFS is not ideal because it stores large (100 MB to 1 GB) files outside the git repo and keeps only pointers to them, so they are not differenced or version controlled the same way.
Controlling forks and preventing copies may be better addressed mostly server-side. Much like with branches and differential compression, there is no reason (except RAID and backups) to store multiple copies of the same data. A web-based git protocol implementation (GitLab and GitHub) may have internal software solutions or rely on lower software and/or hardware layers such as versioning file systems and data deduplication.
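As a rough illustration of server-side deduplication, the sketch below stores each file once under its SHA-256 digest and lets every fork hold only a reference to it. The `DedupStore` class, the fork names, and the file name are hypothetical; this is not a description of how GitLab or GitHub work internally.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical data is kept only once."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> file contents
        self._forks = {}  # fork name -> {path: digest}

    def put(self, fork, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # no second copy if already known
        self._forks.setdefault(fork, {})[path] = digest
        return digest

    def get(self, fork, path):
        return self._blobs[self._forks[fork][path]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
model = b"x" * 1000  # stand-in for a CAD file
for i in range(1000):
    store.put(f"fork-{i}", "drill.FCStd", model)

# One contributor changes the file; only that new version costs extra space.
store.put("fork-7", "drill.FCStd", model + b"changed")
print(store.stored_bytes())
```

With this scheme a thousand untouched forks cost the same as one, and storage grows only with the number of distinct file versions.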
Even a modified git protocol may not be ideal for CAD collaboration, but there are other revision and version control protocols that are also OSS. The primary reason this project has yet to be undertaken is likely the sheer size of it compared to other FreeCAD workbenches and a large amount of more immediately important fixes and additions to FreeCAD, which is still early beta software.
Related
* CAD PLM
* Home made PLM project
Feedback
Yorik
Hi Marcin,
Hmm, complex problem, lots of people have been discussing and arguing about it for a long time...
My two cents:
True real-time collaboration requires a complex, dedicated protocol where each and every "move" is registered. The only one I know of was something called Verse, developed by Eskil Steenberg for Blender (and for an online game he made called "Love"). It kind of worked. But there is a huge amount of data to be transmitted, so it failed quite easily. But you could really see other people push the vertices in real time...
While this is good for playing and "rough 3D sketching", it's not that interesting for more accurate design. Changes are so fast and so many that the "changes stack", or model history, becomes absurdly complex, and therefore useless. Also, speaking from experience, comparing Dropbox-like solutions, where each file save gets recorded, with git-like solutions, where commits are a decision of the user, with a meaningful message and a chosen set of files, I wouldn't hesitate one second to say the latter is far more interesting. There is an enormous difference in the control you can have over the whole design, who did what, when and why, and just by reading the git log you get a fairly good idea of the whole design process.
I would say something like this: Reviewing 3D files together, in real-time, maybe with the ability to mark or annotate them, would be tremendously useful. Modeling and committing changes to the model, however, should be a more carefully thought and undertaken process, with better control over each "step", what each of these steps contains being something that is decided by the designer, and not by an automated system.
The way git handles binary files like FreeCAD files is indeed an issue. FreeCAD files are actually zip files, so they are handled as a binary blob by git. Meaning, on each commit the whole file is stored again. If you commit a 10 MB file ten times, your git folder stores roughly 100 MB.
This is very common as soon as you start working with non-text files. Even if the file format is text-based instead of binary, more often than not, a simple change in a 3D file (moving a piece 10 mm) creates many changes spread all over the file.
Speaking strictly about FreeCAD (but it might apply to other formats too), there are however several possible paths to attack this:
1) If you unzip a FreeCAD file, 90% of its contents is text. There are many projects out there whose purpose is to unzip files when committing to git, and rezip after pulling: https://forum.freecadweb.org/viewtopic.php?f=22&t=8688&start=20 This could help reduce dramatically the size of a git repo containing FreeCAD files (if you move a piece by 10 mm, only a pretty small fraction of the file will change), but it will still be hard to get a human-readable summary of what has changed, which is another big advantage of working with git.
2) I started working some time ago on a small script ( https://github.com/FreeCAD/FreeCAD/blob/master/src/Tools/fcinfo ) whose purpose is to print a "text representation" of a FreeCAD file. The idea is that you could use this script to produce human-understandable diffs between two FreeCAD files (e.g. object "Piece" was at x=10mm in the old file, now it is at x=20mm). Git allows all kinds of tricks for using your own diff programs for specific file types.
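To make path 2 more concrete, here is a minimal sketch of a "text representation" diff. It is not the fcinfo script; it only assumes that the archive contains a `Document.xml` member whose `Object` elements hold `Property` elements with attribute-carrying children. The sample XML, the `summarize` and `make_archive` helpers, and the object name are invented for illustration, and real files may be laid out differently.

```python
import difflib
import io
import xml.etree.ElementTree as ET
import zipfile


def summarize(fcstd_bytes):
    """Return sorted 'Object.Property: attrs' lines read from Document.xml."""
    with zipfile.ZipFile(io.BytesIO(fcstd_bytes)) as zf:
        root = ET.fromstring(zf.read("Document.xml"))
    lines = []
    for obj in root.iter("Object"):
        for prop in obj.iter("Property"):
            for value in prop:
                attrs = " ".join(f"{k}={v}" for k, v in sorted(value.attrib.items()))
                lines.append(f"{obj.get('name')}.{prop.get('name')}: {attrs}")
    return sorted(lines)


def make_archive(px):
    """Build a tiny stand-in archive (hypothetical, much simpler than a real file)."""
    xml = (
        '<Document><ObjectData><Object name="Piece"><Properties>'
        '<Property name="Placement">'
        f'<PropertyPlacement Px="{px}" Py="0"/>'
        "</Property></Properties></Object></ObjectData></Document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("Document.xml", xml)
    return buf.getvalue()


old = summarize(make_archive("10"))
new = summarize(make_archive("20"))
# Keep only the changed lines, dropping the '---' / '+++' file headers.
diff = [
    line
    for line in difflib.unified_diff(old, new, lineterm="", n=0)
    if line[0] in "+-" and not line.startswith(("+++", "---"))
]
print("\n".join(diff))
```

A script of this kind could be wired into git as a `textconv` filter for a custom diff driver (set through `.gitattributes` and `git config`), so that `git diff` shows the summary lines instead of reporting that a binary file changed.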
Basically this won't solve the main problem of "what happens when two people commit to the same file at the same time". No software I know of is able to work out this kind of problem with binary files. Unzipping a FreeCAD file might allow a partial merge, though. But this needs some heavy testing, and will fail in many cases. The .brep files, for example, which store the shape of each object, although they are text files, often change drastically with a very simple change in the model, and will probably not be reconcilable/mergeable.
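The unzipping idea can be sketched as follows, using only Python's standard `zipfile` module. The member names and contents are hypothetical placeholders, and the helper functions are written here for illustration; this is not one of the existing unzip-on-commit projects.

```python
import io
import zipfile


def explode(archive):
    """Map each member name of a zip archive to its uncompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def repack(members):
    """Rebuild a zip archive from a {name: bytes} mapping, in sorted order."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(members):
            zf.writestr(name, members[name])
    return buf.getvalue()


def changed_members(old, new):
    """Names of members that were added, removed, or modified."""
    a, b = explode(old), explode(new)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Hypothetical two-member archive: an XML description plus a shape file.
old = repack({"Document.xml": b"<Piece x='10'/>", "Shape.brp": b"shape data"})
new = repack({"Document.xml": b"<Piece x='20'/>", "Shape.brp": b"shape data"})
print(changed_members(old, new))
```

If two contributors touch disjoint sets of members, taking each changed member from the side that changed it is at least conceivable. When both change the same member, and in particular a shape file, the conflict remains.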
But even so, there is certainly huge progress to be made, and I believe these zip-based compound formats like FreeCAD's (LibreOffice uses that system too, and nowadays, would you believe, even Microsoft: docx and xlsx files are also unzippable) are the best possible compromise between binary formats and text-based formats...
I'd love to know how this question is going forward on your side, please keep me informed!