The Rise of Curation on GitHub

Authors

  • Yu Wu
  • Jessica Kropczynski
  • Raquel Prates
  • John M. Carroll
Abstract

Curation practices have recently begun to develop on GitHub, where developers make systematic efforts to select, evaluate, and organize existing artifacts for preservation and future use in software development. Curation practices on social media sites such as Twitter and Pinterest have been investigated, raising questions about the nature of collaborative curation on a professional, product-oriented site. In this study, we identify and characterize curation projects hosted on GitHub and compare them with software projects to study how this practice takes place and how it differs from the original use of GitHub. We find that curation has emerged as a highly popular category of GitHub project, one directed toward learning and professional development, and that curation practice leverages collaborative tools and practices native to GitHub. Although curation projects and software projects use the same set of activities for development, they differ in the quantity of each type of activity performed by developers.

Similar resources

Study of the foundation, models and issues of research data curation and management in scientific and academic environments

Background and Aim: The purpose of this paper is to study, identify, and discuss the foundation and concepts, models and frameworks, dimensions, and challenges of research data curation and management in scientific and academic environments. Method: This is a review article, and the library method was used to collect scientific and research texts in this field. In this research, external an...

On Automating Basic Data Curation Tasks

Big data analytics is firmly recognized as a strategic priority for modern enterprises. At the heart of big data analytics lies the data curation process, which consists of tasks that transform raw data (unstructured, semi-structured, and structured data sources) into curated data, i.e. contextualized data and knowledge that is maintained and made available for use by end-users and applications. To ac...

Phylesystem: a git-based data store for community-curated phylogenetic estimates

MOTIVATION Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phyl...

The Sol Genomics Network (solgenomics.net): growing tomatoes using Perl

The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A n...

The Open Spectral Database: an open platform for sharing and searching spectral data

BACKGROUND A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. As a result, searching for and retrieving such spectral data can be time consuming, and the data can be difficult to reuse if it is compressed in the JCAMP-DX file. What is needed is a single resource that allows submission of J...

Publication date: 2015