S Venkataraman

Venkat has a background in biology, imaging and repositories and gained expertise in RDM. He then became the lead trainer at the DCC and is currently the Training Officer for OpenAIRE.

Location Edinburgh

Activity

  • From the perspective of choosing the "right" repository as a data depositor, of course, learning about certification is a valuable exercise - you want to be as certain as possible that the repository will safeguard your data for as long as possible.

  • This is a great question and one that doesn't have an easy answer! I'm afraid I don't have any concrete examples for you but I will say that the ideas behind open research/science play a big part here - self-policing by the community *should* reveal any misuse. The system is not perfect, of course, and there will be cases of misuse, but these appear to be...

  • S Venkataraman made a comment

    Thank you to everyone who provided responses in the Padlet exercises! :)

  • Hi Giordano, thanks for your comments and for the link to the ISO standard. Yes, unfortunately you have to pay to go any further. The ISO is considered the "gold standard" and certification costs a lot of money, so only the repository itself can decide whether it is worthwhile. For example, the University of Edinburgh's institutional...

  • Hi Liz, glad that you think so. Could you elaborate on why exactly you find it to be a great resource for you?

    Indeed, from the POV of a service provider, it will be useful to point researchers to suitable repositories for their data and if that can be fulfilled then many other considerations could be potentially answered for the researcher such as what...

  • Hi everyone, welcome to Week 4. I'm Venkat, one of the team that is moderating these comments threads, and will be leading this week. Feel free to start a conversation on this week's topic of repositories and I'll get back to you. Hope you've all been finding the course valuable thus far! :)

  • Good question, and I'm afraid I'm not entirely sure what the answer would be. What you outline is true, and we encourage you to take the RISE exercise in collaboration with all parties that would be directly involved in the building of services (including researchers who can provide insight into what services might be required), and ideally this should be done...

  • Completely understand your concern here! This is a problem that has been identified before and one we hope we can rectify in the future - that there is not enough granularity in the RISE tool. In this case I would recommend that you choose the option that at least matches one of the services you currently have available. You should make a note of this for your...

  • @JohnBosco Do you mean who should conduct the gap analysis in your institution (or any given institution)? If so, we recommend that the exercise is done collaboratively by a mixture of IT, library and research staff. Maybe also research office staff. Together, they will provide the varied viewpoints necessary and the knowledge of the services currently present...

  • @SarahDellmann Indeed, this might be an anomaly compared to how things are done now. The University of Edinburgh was one of the earlier adopters/implementers of RDM services (at least in the UK) and therefore did not have as many prior examples to build upon. There was definitely trial and error.

  • A very valid question but unfortunately there are valid reasons too - that there is disparity with more developed countries, not least due to financial considerations. However, there are still "free" (e.g. Google Drive) and open source options available that should be considered and which could be implemented easily given enough time and effort.

  • Agreed! Please see my reply below to Anne...

  • Absolutely. A similar experience at the University of Edinburgh - you can spend millions on building services but if no one knows about them then they will not be used! Therefore, a single point of entry to all the services was created and people were also recruited to new roles that were created to help researchers find suitable services for their work. See...

  • Small correction: we are discussing "open data" and not "open source data". "Open source code" is separate but you are correct that it can also be treated as data (since it is a digital object too). "Open source" refers to the ability for third parties to modify the code without restrictions.

  • Valid comment and yes this version of the lifecycle may not be the best fit for your circumstances. There are many versions but the one presented here is very simplistic - mainly for the purposes of teaching.

    You can also see the DCC's curation lifecycle which is far more complex!: https://www.dcc.ac.uk/guidance/curation-lifecycle-model

  • Nice comment and I share your frustration on trying to encourage better RDM practices by researchers. Indeed, this is a common issue and should not just be about funder or institutional mandates but should also be about researchers actually being able to understand the benefits. There are ways that this can be done such as showing increased citation rates open...

  • The definitions that you refer to can cause confusion to many people so don't worry! Yes, a repository is essentially a website that houses data. We'll discuss these more later but researchers deposit their data in these repositories with the intention of long term *preservation*. In this case, preservation means that the data has been stored in such a way...

  • I somewhat see what you mean here regarding the definition of "sharing". It can be misconstrued but essentially we are talking about licensing of data, e.g. the use of Creative Commons licences, which should be considered before "publication" in this particular workflow. Licensing is a prerequisite to sharing and defines *how* the data may be shared, if at all.

  • Please also think about this lifecycle as something aspirational! I agree that not everywhere will have these things in place - or may never have them due to financial constraints, for example. We'll discuss later.

  • I completely agree that this lifecycle is very idealised! Indeed, there can be, and most likely there will be, deviations from this in any given research project.

    As for the planning phase, yes, this can be an issue where researchers do not see the benefits of such things as DMPs. Increasingly, there are mandates from funders and institutions for...

  • Thanks for your observations - I agree! Please see what was done here at the University of Edinburgh for an example of what you described of processes vs phases in research: https://www.digitalresearchservices.ed.ac.uk/

    (Mouse over each "phase" in the lifecycle to show more info and click to show a list of services.)

    I don't expect you to understand the...

  • Hi Anne, thanks for your observation. Is there somewhere that we can see the lifecycle that you use? Or could you describe it further here please? The lifecycle we show here is indeed very generalised and is only one example of many depending on context.

  • This is a very interesting example - thanks for sharing! Do you foresee any way in which you could identify artists by piecing together different descriptors? I appreciate this could be a challenge that might not be fully achievable due to the age of some of the art.

  • Trust is an important consideration and this is directly related to the level of curation that any given dataset undergoes. These are indeed subjective but there are various ways that some trust could be achieved. One example is adhering to minimum information standards: https://fairsharing.org/collection/MIBBI where metadata are described as richly as...

  • A warm welcome to everyone - I'm Venkat, one of the educators on this MOOC. I work for the Digital Curation Centre (DCC) in Edinburgh, UK and before joining the research data management (RDM) field I was a biologist. I will be moderating these threads along with Ellen, Rene and Sarah from the Netherlands and Alex and Ryan here in the UK. Together, hopefully we...

  • @RaoulTeeuwen Thank you for your analysis and reporting on the broken links.

  • @ElizeE May I please ask why you say that the "most common type of licence that researchers in my institution would select is:
    Creative Commons Attribution-NonCommercial-NoDerivs (CC-BY-NC-ND)"? Is this because most research in your institution is of a sensitive or IPR nature? Otherwise, there should not be a need to make the data so restrictive.

  • @YanGrange When we talk about embargoes, we mean that during this time the data is not visible to the outside world. The data has been deposited in a repository but cannot be found yet. Once the embargo period ends it will be released. At this point, whichever licence has been applied will take effect.

  • @JudithBrands One trick that you could try for data specific to your research project is to think backwards: try to identify a suitable repository for your data from the start. If you are successful in identifying a domain/subject specific repository, investigate which metadata standard that repository uses - this information is usually easily findable. It is...

  • @LolkeBoonstra This could potentially be true, but the onus is on the researcher and their superiors to make sure that this does not happen. This also assumes there is an RDM policy in place. Here, we are talking exclusively about the "active" phase of research and any data that is not finalised. Many research institutes have similar tools to SURFdrive that allow...

  • @SoileManninen Thank you for sharing these.

  • @YanGrange Excellent point and I completely agree that source code and software should be considered as data in itself. Indeed, tools such as EUDAT's licence selector: https://ufal.github.io/public-license-selector/ allow you to specify whether you want to license such data.

  • @ElizeE DMPonline and the LIBER DMP catalogue are complementary. The former provides a tool for easily writing a DMP which can (very importantly) be done collaboratively and across borders. It also provides templates and guidance notes. The LIBER DMP catalogue could be a source of inspiration to write your own DMP especially if you can find one that is from...

  • @KateK Thanks for this comment. Could you please elaborate on what gap analysis you did? Was it the RISE analysis or something else?

  • @ElizeE I apologise - yes, when it comes to health data "persuasion" is not the word or course of action to take. However, it is likely that the vast majority of cases where they would decline permission are based on misinformation (maybe from MSM) and/or a misunderstanding of what sharing is, so explaining this to assuage fears should at least be attempted.

  • @BurcuOrtakaya You could consider yourself lucky that you have a blank canvas to work with - so many options without limits! :)

  • @BurcuOrtakaya This should have been made clearer, and apologies that it wasn't, but you do not need to select any of the options when answering the questions in RISE - this is by default "Level 0".

  • Regarding not getting permission from a subject to share data: indeed, I'm afraid you are correct that you would have to abide by that restriction. However, there is nothing stopping you from persuading the subject of the benefits of allowing sharing and also from reassuring them of safeguards against misuse, etc. I hope you have found some examples of...

  • I'm not sure that I fully understand what you have written, but to possibly answer the first part: the video clips can be defined as data. This is a common issue where many researchers overlook that some things that they work with, whether digital or not, could be considered to be data. Therefore, CC licensing, which is the most common licensing method for...

  • @ClaireE This is not uncommon - I have recently been involved in quite a few workshops where I have introduced the RISE tool and most (maybe all!) showed that the respective institutions were at Level 1. This is unsurprising since the concepts of best practices in RDM, FAIR and open research are still, I would argue, in their infancy. That institutions are...

  • @AnjaM This is a valid criticism of the current version of the tool and one that I believe will be addressed eventually. Although not explicitly said, it is possible to not pick any of the options listed which equates to "Level 0" - services that don't exist (yet). Apologies that this is not made clear.
    However, I'm glad that the German version you reference...

  • @JudithBrands I should have also added a link to the schools that we help run in LMICs if you are interested in what we do:

    https://codata.org/initiatives/strategic-programme/research-data-science-summer-schools/

  • @JudithBrands No problem - I completely understand your reservations! :)

    I too come from a research (life sciences) background and in my role now in RDM I see the wider picture. We travel extensively to provide training in many countries internationally and one such region is low and middle income countries (LMICs) such as those in Africa. Your stated...

  • Thanks for this. And please also see the comment from Sarah directly below with links to an IDCC workshop that covered this. :)

  • @BertHuizing This is certainly true, but I can say from anecdotal evidence that it can still be a challenge if these services are not as user friendly as established third party options such as Google Drive, etc. The size of the institution, its research income and its priorities all need to be taken into consideration too before embarking on the costly road...

  • @DilettaBeconcini This can be an issue sometimes. Have you tried contacting the data owners (creators) directly, if known and if this is feasible, to ask for permission to reuse their data?

  • I completely agree with your sentiments but we need to distinguish good RDM (which should *mainly* be the responsibility of the researcher) and good RDM services (and policies), which, as you say, should be the responsibility of universities and institutions.

  • I should also add: please remember that, unless this is not the case for you, much of the research in universities and other institutions is funded by public money and/or charities, and it is a necessity that research outputs from such work should be visible to not just other researchers but society as a whole. It is their data as much as the researchers'.

  • To follow on from Boudewijn's comments:
    1) this is a common concern and effort required by researchers should be minimised as much as possible, of course. However, the onus is on the researcher to ensure reproducibility of their research and to ensure research integrity. Ultimately, this is what good RDM practices, including FAIR, aim to achieve. A key aspect...

  • I think this is an excellent point and one which needs to be stressed further. However, before doing so, please keep in mind that the curation lifecycle model that we use in this course is just one of many variations, and there may be one better suited to your specific needs. The one used here is fairly general to accommodate the scope of this...

  • @JulienColomb Thank you for flagging this problem. Indeed it is something we are going to think about to help facilitate further discussion beyond the course run - what other platforms can be used to allow discussion. The comments will close on 15th Nov I believe in this case. In the meantime, we suggest that participants use the Twitter hashtag #RDMservices...

  • @LindaWagner Great comments and sincerest apologies for the broken links - I'll pass on the details of those to get them fixed.

  • @MariaVivas-Romero Indeed, it is very case specific, but typically "open" data will fall into either CC0 or CC-BY, and possibly CC-BY-SA. Is there any particular reason you would also have to apply ND?

  • I completely agree Linda! CC-BY is as restrictive as it should be when making data "open".

  • Yes, this is a source of confusion and one that is common! I think the real difference between CC0 and Public Domain hasn't been properly described here so let me try: the former is an *active* waiver of all rights by the owner, placing the work in the public domain. The latter occurs when something can have no known rights, either because copyright has expired or...