Analysis of PLM Software Conflict Resolution

From Open Source Ecology

Revision as of 21:57, 24 July 2019

How does professional-grade PLM software resolve file conflicts? And how does OSE achieve even higher performance using simple online tools? Here is an assessment of the state of the art in each.

Problem Statement

PLM software is not designed for mass collaboration. Typical teams in industry resolve conflicts by checking out a file and locking it. This does not work for OSE because nobody else can work on a checked-out file concurrently.

The ideal solution is real-time collaboration. Semi-real-time collaboration can occur when a person is connected online to a repository and FreeCAD downloads changes from other contributors at an interval of about 1 minute. This is undesirable because it creates collaborative waste: people have to negotiate conflicting changes with one another, and any incompatible change must result in a fork. This can be avoided simply by starting from a fork in the first place and making a pull request into the main branch later. This is the mechanism that GitHub uses.

However, the GitHub mechanism of high-level conflict resolution does not work for OSE, because an entire repository must be cloned. Because CAD files are so much larger than source code, storage limits are reached very quickly. For example, working on a single 10 MB file in a project of, say, 1 GB total size still requires a copy of the whole project; forked by 1000 contributors, that adds up to about 1 TB of storage. This is prohibitive in terms of storage.
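The arithmetic behind that estimate can be checked with a short sketch. The figures are the ones from the example above; decimal units (1 GB = 1000 MB) and the "changed file only" comparison case are assumptions added here for illustration.

```python
# Figures from the example above; decimal units are an assumption.
PROJECT_SIZE_MB = 1_000   # about 1 GB total project size
CHANGED_FILE_MB = 10      # the single file a contributor actually edits
CONTRIBUTORS = 1_000


def total_storage_mb(per_contributor_mb: int, contributors: int) -> int:
    """Total storage if every contributor holds an independent copy."""
    return per_contributor_mb * contributors


# Every fork copies the whole project.
full_copies_mb = total_storage_mb(PROJECT_SIZE_MB, CONTRIBUTORS)
# Hypothetical comparison: every fork stores only the file it changes.
changed_only_mb = total_storage_mb(CHANGED_FILE_MB, CONTRIBUTORS)

print(f"full project per fork: {full_copies_mb / 1_000_000:.1f} TB")
print(f"changed file per fork: {changed_only_mb / 1_000:.1f} GB")
```

Under these assumptions the full-copy approach needs roughly 100 times more storage than keeping only the changed file per contributor.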

Potential Solutions

Version control for PLM in FreeCAD is a complex but solvable problem. Many existing software layers already address parts of it, and most are open source. The difficulty is adapting them and putting them together in an easy-to-use software package.

Ideally, CAD data would be stored in a format that is easy to diff and to place under version (revision) control. A FreeCAD workbench with an understandable and somewhat automated workflow might avoid the archive format and instead put the XML and small binary files into a folder managed by a modified git protocol, with tweaks such as binary diff enabled so that commits and branches don't create copies of the data. The workbench may need settings that help the user control branching and keep commits and diffs reasonable, as well as manage and even cull dead-end branches when needed. The ability to see files being worked on (checked out) by other users would help enable communication and planning about versions before any conflicts are created. A voting system may also help manage decisions in large groups of contributors. Changing the FreeCAD file format, or not using the archive format for collaboration, may also enable more fine-grained control over versioning different types of data objects in different ways, similar to proprietary PLM CAD sharing platforms.
*https://stackoverflow.com/questions/4697216/is-git-good-with-binary-files
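A minimal sketch of the "avoid the archive format" idea, assuming the .FCStd file is a standard zip container (as FreeCAD documents its format) holding XML documents and small binary shape files. The function name `unpack_fcstd` and the paths are hypothetical; this is not an existing FreeCAD workbench API.

```python
import zipfile
from pathlib import Path
from typing import List


def unpack_fcstd(fcstd_path: str, out_dir: str) -> List[str]:
    """Extract the members of a .FCStd archive into a plain folder.

    Assumption: the archive is an ordinary zip file. Once unpacked, each
    member (XML, small binaries) can be tracked by git as a separate file,
    so a change to one part does not rewrite one opaque archive blob.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(fcstd_path) as archive:
        archive.extractall(out)
        return sorted(archive.namelist())
```

A workbench along these lines would call something like `unpack_fcstd("part.FCStd", "part_unpacked")` before a commit and re-zip the folder when the file is opened; the re-zip step and any git integration are left out of this sketch.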

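Git LFS is not ideal because it stores large files (100 MB to 1 GB) outside the git repository and keeps only pointers in the repository, so the files are not diffed or version controlled in the same way.
*https://docs.gitlab.com/ee/workflow/lfs/manage_large_binaries_with_git_lfs.html#how-it-works
*https://blog.grabcad.com/blog/2013/10/04/dropbox-for-cad-sharing-is-a-mistake/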
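To illustrate why pointer files limit diffing, the sketch below builds the small text stub that Git LFS commits in place of the real file, following the three-line layout of the Git LFS pointer specification (version, oid, size). The helper name `lfs_pointer` and the sample byte strings are hypothetical.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Return the text of a Git LFS style pointer for the given content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# Two revisions of a made-up CAD file that differ by a single byte.
rev_a = lfs_pointer(b"solid block 10x10x10")
rev_b = lfs_pointer(b"solid block 10x10x11")

print(rev_a)
print(rev_b)
```

Comparing the two pointers only shows that the hash changed. What changed inside the model is not visible to git, and each revision of the large file is kept as a separate whole object in LFS storage.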


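Controlling forks and preventing copies may be better addressed mostly server-side. As with branches and differential compression, there is no reason (other than RAID and backups) to store multiple copies of the same data. A web-based git hosting service (GitLab or GitHub) may have internal software solutions for this, or may rely on lower software and/or hardware layers such as versioning file systems and data deduplication.
*https://docs.gitlab.com/ee/administration/repository_storage_types.html
*https://gitlab.com/gitlab-org/gitlab-ce/issues/23029
*https://en.wikipedia.org/wiki/Versioning_file_system
*https://en.wikipedia.org/wiki/Data_deduplication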
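The deduplication idea can be sketched as a content-addressed store in which every fork references file contents by hash, so identical data is kept once no matter how many forks exist. This is only an illustration under that assumption; the class `DedupStore`, its methods, and the sample files are hypothetical and do not describe how GitLab or GitHub implement fork storage.

```python
import hashlib


class DedupStore:
    """Toy server-side store: one blob per unique content, shared by forks."""

    def __init__(self):
        self.blobs = {}  # sha256 hex digest -> file content (stored once)
        self.forks = {}  # fork name -> {file path -> digest}

    def put(self, fork: str, path: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # no second copy if already known
        self.forks.setdefault(fork, {})[path] = digest

    def get(self, fork: str, path: str) -> bytes:
        return self.blobs[self.forks[fork][path]]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


project = {"frame.xml": b"a" * 100, "wheel.brp": b"b" * 200, "axle.brp": b"c" * 300}

store = DedupStore()
for n in range(1000):  # 1000 contributors fork the same project
    for path, content in project.items():
        store.put(f"fork-{n}", path, content)

# One contributor changes a single file in their fork.
store.put("fork-7", "wheel.brp", b"d" * 200)

print(store.stored_bytes())  # 800: the 600-byte project once, plus one changed 200-byte file
```

In this toy model the cost of 1000 forks is the size of the project plus the size of whatever actually changed, instead of 1000 full copies.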