The difficult path toward data sharing calls for well-matched sharing models and operating mechanisms. Based on the current state of scientific data sharing in the big data era, this paper combines literature review, theoretical transplantation, and scenario analysis. It summarizes progress in the study of scientific data sharing models and points out the limitations of existing research. It then suggests a reanalysis from the perspective of new institutional economics and proposes a design that combines model abstraction with scenario analysis of real cases.
The literature review is organized at three levels: a micro-level perspective on the research data themselves, a meso-level view of the operating organizations, and macro-level insights into institutions. Research data sharing activities and sharing models are then described, including models in specific disciplines and in other information resource fields.
Theories of new institutional economics are then introduced. Depending on data asset specificity and data trading frequency, different research data sharing activities tend to select different organizational forms. Corresponding to these preferred organizational arrangements, there are five main kinds of data sharing models: the intra-organization level data sharing model, the controlled data sharing model, the intermediate-organization level data sharing model, the individual data exchange model, and the free market data sharing model.
Building on the abstraction of these five logical models, scenario analysis is applied to mainstream data sharing cases, namely the data resources pool model, the data publishing model, and the data market model. Guided by the supply-demand chain within data sharing models, the stakeholders and their interactions along the data flow are reviewed. The advantages, disadvantages, opportunities, and challenges of the three mainstream sharing models for open data are compared, and prospects for future development are discussed.
Finally, we find that the data resources pool model is the most important means of scientific data sharing and that its predominance will persist for a long time to come. However, because its incentives come from external sources and its sharing path is top-down, the data resources pool model cannot fully mobilize the initiative of data suppliers. As a result, the cost of supervising and evaluating this model is high. The data publishing model, by contrast, successfully facilitates data sharing and also helps data suppliers gain scholarly reputation. It still needs further development in many respects, such as the establishment of scholarly reputation and acknowledgement, the sustainability of the business model, and the balance between open data and copyright protection. In a way, the data publishing model may become the real mainstream in the future scientific research community. In addition, the introduction of free market exchange makes the data market model a useful additional option when selecting a data sharing model. The obstacles to this model include the technical challenges arising from the high specificity of data assets and the difficult negotiation between publicly funded data production and value-added data trading.
Overall, as scientific data sharing continues to develop, the study of sharing models should continue as well. The mainstream data sharing models are bound to complement one another and to jointly promote scientific research through data sharing, within but not limited to the existing models. Meanwhile, the path dependence inherent in data sharing reshapes, to some extent, how ideal data sharing models are put into practice. Data sharing is also inevitably subject to the existing local culture and customs of the research community. Reforms of data sharing models are therefore more likely to proceed gradually, step by step. Given the current arrangement of data sharing models, open data can make more progress if the dominant model, the data resources pool model, absorbs more from the other models according to particular data attributes. 5 tabs. 34 refs.