This study explored research fronts in agricultural science to gain insight into the future development of agricultural technology. Science mapping results can make members of governmental organizations (such as policymakers, technology managers, and research departments) aware of emerging technology trends and help them allocate resources to important issues in sustainable agriculture. The study applied informatics techniques to identify research fronts in agricultural science and technology from pioneering agricultural journals published between 2009 and 2019. Text mining and bibliometric analysis were used to reveal the important research topics and their evolution patterns based on scientific publications. The bibliometric data of each article, including title, keywords, authors, publication date, and journal name, were used for research mapping by statistical analysis. Consequently, 205 keywords were selected as popular topics because they appeared in all five selected journals. Based on the semantic definition of the keywords, these 205 topics were classified into ten major disciplines: agricultural multi-disciplines, intelligent agriculture, plant species, agricultural management, soil, crop phenotypes, environment, plant physiology, ecology, and pest and disease. The knowledge structure of the ten major disciplines was then constructed from those 205 popular keywords, which could help identify more emerging topics and provide insight into new research directions. To evaluate each topic's trend from 2009 to 2019, statistical analysis was performed by counting the number of keyword occurrences in each year. The results showed that sustainable agriculture, conservation agriculture, precision agriculture, organic agriculture, and food security were the leading issues in the field.
In short, this study combined data mining with a bibliometric approach, which not only annotated the multidisciplinary issues of agriculture by focusing on the main thematic areas but also provided insight for discovering potential technologies in the agricultural sector.
Keywords: Data mining, text mining, textual analysis, bibliometrics, scientometrics, knowledge management
Agriculture is the foundation of the food production industries and one of the dominant economic sectors. Agricultural science and technology play a vital role in worldwide food security and, from an ecological perspective, in maintaining natural resources and landscapes such as water, soil, and biodiversity. Due to the rapid growth of the global population, the agricultural sector must seek more efficient and innovative ways to produce food, and the development of science and technology has become an important solution. Bibliometric analysis has become a valuable science mapping tool for understanding the trends of agricultural research issues and discovering emerging technologies. Since 2019, the Council of Agriculture, Executive Yuan, R.O.C. (Taiwan) has supported bibliometric analysis projects to identify the "hot topics" in agricultural science. In this research, an international database, Web of Science® (WoS®), was used to explore scientific publications (Bojović et al., 2014). Data mining was performed through a web crawler program called "AgriAnalytics" that collects bibliometric information such as authors, affiliations, title, abstract, keywords, and publication date. Bibliometric analysis was subsequently used to obtain the research front results. In sum, combining conventional analysis methods with data mining techniques provides a new approach for gaining insight into agricultural issues and assessing technology trends.
The bibliometric results were extracted from the WoS® database for the period from 2009 to 2019. First, the term "agriculture" was used as the search condition in the keyword field, and the ten journals with the highest coverage rates were selected for the following analysis. Only journals with an impact factor (IF) above two were retained to ensure the quality of the references (Bojović et al., 2014). Consequently, five pioneer journals representing different agricultural domains were chosen for bibliometric analysis according to the categories of the journal ranking website SCImago Journal & Country Rank (SCImago, n.d.). These journals are Proceedings of the National Academy of Sciences of the United States of America (PNAS), Field Crops Research, Frontiers in Plant Science, Agriculture, Ecosystems & Environment, and Computers and Electronics in Agriculture.
The data mining technique was used not only to identify the important issues in multidisciplinary science but also to collect bibliometric information from the websites of the pioneer journals, including authors, affiliations, titles, abstracts, keywords, and publication dates. In this study, the textual data on each article's web page were collected using a widely used programming language, Python. After the raw data were processed and standardized, the data set was saved into an online database called AgriAnalytics for further analysis.
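The paper does not describe the standardization step in detail. The following is only a minimal sketch of what such a step might look like, assuming a hypothetical raw record in which authors and keywords arrive as semicolon-separated strings and the date begins with a four-digit year. The `Record` class, the `standardize` function, and the field names are illustrative and are not taken from AgriAnalytics.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One article's bibliometric fields after standardization (hypothetical)."""
    title: str
    authors: list[str]
    keywords: list[str]
    year: int
    journal: str


def standardize(raw: dict[str, str]) -> Record:
    """Turn a raw scraped record (all string fields) into a clean Record.

    Assumed raw layout: authors and keywords are semicolon-separated
    strings, and the date string starts with a 4-digit year.
    """
    def split(field: str) -> list[str]:
        # Split on ';', collapse internal whitespace, drop empty items.
        return [" ".join(item.split()) for item in field.split(";") if item.strip()]

    # Lowercase keywords and remove duplicates, keeping first-seen order.
    keywords = list(dict.fromkeys(k.lower() for k in split(raw["keywords"])))

    return Record(
        title=" ".join(raw["title"].split()),
        authors=split(raw["authors"]),
        keywords=keywords,
        year=int(raw["date"].strip()[:4]),
        journal=raw["journal"].strip(),
    )
```

Lowercasing and de-duplicating keywords at this stage means that later counts treat "Precision Agriculture" and "precision agriculture" as the same topic.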
In total, 99,326 articles published from 2009 to 2019 in the five indicator journals were collected by the website platform "AgriAnalytics." The textual data containing bibliometric information were analyzed for science mapping by keyword analysis (Rodriguez‐Ledesma et al., 2015). The distribution of keywords across the five pioneer journals was examined by cross-comparison. The cross-comparison yielded 116,108 keywords, of which 205 appeared in all five pioneer journals. In other words, scientists appear to have regarded these keywords as important topics from 2009 to 2019 (Figure 1).
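The cross-comparison described above amounts to a set intersection over the keyword sets of the journals. The sketch below illustrates the idea under the assumption that each journal's keywords have already been standardized into a Python set. The function name `shared_keywords` is hypothetical.

```python
def shared_keywords(keywords_by_journal: dict[str, set[str]]) -> set[str]:
    """Return the keywords that occur in every journal's keyword set."""
    keyword_sets = list(keywords_by_journal.values())
    if not keyword_sets:
        # No journals supplied, so nothing can be shared.
        return set()
    # Keep only the keywords present in all sets.
    return set.intersection(*keyword_sets)
```

Applied to the keyword sets of the five pioneer journals, a function of this kind would return the keywords common to all of them, which is how the paper defines its 205 hot topics.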
Hot topics in agriculture from 2009 to 2019
Domain experts subsequently grouped the 205 keywords, based on semantic knowledge, into 11 heterogeneous and multidisciplinary issues: agriculture, intelligent agriculture, plant species, management, soil, environment, phenotypes, plant physiology, ecology, pest and disease, and others. The "others" group contains keywords that are not typical terms in the agricultural technology research community, such as net income, smallholders, and farmers. Generally speaking, technological topics are associated with food production issues, such as increasing crop yield, improving quality, and enhancing management efficiency. However, the bibliometric data suggest that agricultural economics topics are also widely discussed in the technology community. For instance, it has been shown that an automatic irrigation scheme based on weather forecasts might enhance farmers' net income (Abd El Baki et al., 2018). Another case showing the strong connection between technology and the economy found that applying rhizobium inoculants to soybean planting in Nigeria would increase farmers' economic benefits (Ronner et al., 2016). In our results, over 50% of the papers with the ‘farmer’ keyword studied African regions, such as sub-Saharan Africa, Kenya, northern Ghana, Zimbabwe, West Africa, and eastern Tanzania. The main technologies mentioned in those papers were breeding, crop management, soil management, and weed management, and the research mostly focused on smallholders and sustainable agriculture. In short, the technology topics relate not only to field management but also to rural areas of Africa and smallholders (Figure 2).
Knowledge structure for intelligent agriculture
The intelligent agriculture issue includes more than 30 of the 205 topics, so we chose it as the priority issue in this study (Figure 2). Statistical analysis of the journal Computers and Electronics in Agriculture from 2009 to 2019 was performed to evaluate the influence of each group. In total, 8,207 keywords were collected from this journal. Keywords that appeared more than nine times were selected as hot topics and grouped by meaning (Figure 3). The data show that "precision agriculture" has the highest count, which indicates that the concept of intelligent agriculture mainly comprises precision agriculture technology. Intriguingly, precision livestock farming and animal welfare are also frequently mentioned in this issue, which suggests that precision agriculture is possibly being applied in the animal industry. The second highest count belongs to the keyword machine vision (including computer vision), which falls under the image technology theme. The statistical results for the image technology and algorithm themes correlate well with each other because various algorithms are necessary for extended applications of computer vision techniques.
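As an illustration of the frequency threshold used here, the sketch below counts keyword occurrences and keeps those that appear more than nine times. It assumes the keywords are available as a flat list of standardized strings. The `hot_topics` function and the optional `aliases` mapping (used to fold "computer vision" into "machine vision") are hypothetical names and do not reproduce the authors' program.

```python
from collections import Counter


def hot_topics(keywords, min_count=10, aliases=None):
    """Return (keyword, count) pairs with count >= min_count, most frequent first.

    `aliases` optionally maps a variant keyword to the label it should be
    counted under, e.g. {"computer vision": "machine vision"}.
    """
    aliases = aliases or {}
    # Count each keyword under its alias if one is defined.
    counts = Counter(aliases.get(k, k) for k in keywords)
    # min_count=10 corresponds to "appeared more than nine times".
    return [(k, n) for k, n in counts.most_common() if n >= min_count]
```

Without the alias mapping, variants of the same concept would be counted separately and could each fall below the threshold.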
The keyword analysis results were used to construct a knowledge structure that draws a picture of intelligent agriculture and provides scientometric information about it. For example, remote sensing technology, algorithms, modeling, and image technology are the major technologies in this theme. In addition, the purposes of developing intelligent agriculture focus on decision support and increasing the efficiency of farm management. The potential targets in the food production industry are corn, wheat, cotton, pigs, apples, potatoes, tomatoes, and other high-value crops or animals. The number of keywords in the remote sensing technology theme indicates that remote devices and sensing technology may be ready for application in industrial farming equipment.
Trends for emerging technology in agriculture
The "management" issue was examined further because it contains many hot topics and because the keyword "conservation agriculture" also appears in the list of hot topics. The bibliometric data from the pioneer journal Agriculture, Ecosystems & Environment were examined by keyword statistics to reveal the trends of various farming management practices. The "hot" keywords extracted in this study include pesticide(s), herbicide(s), integrated pest management (IPM), integrated weed management, pest management (pest control), and weed management. According to the results, "pesticide(s)" is the main topic in the management issue; however, the counts of integrated pest management (IPM) and pest management (pest control) rose after 2016. In addition, the herbicide(s) topic received more attention than weed management, but this pattern reversed after 2018. These data indicate that environmentally friendly management has become more prominent than conventional farming practices (Figure 4).
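The trend comparison above can be sketched as a per-year count for each topic followed by a year-by-year comparison of two topics. The sketch assumes each article is represented as a `(year, keywords)` pair. The function names are hypothetical, and the toy records in the usage note are invented for illustration and are not the study's data.

```python
from collections import defaultdict


def yearly_counts(records, topic):
    """Count, per year, the records whose keyword list contains `topic`."""
    counts = defaultdict(int)
    for year, keywords in records:
        if topic in keywords:
            counts[year] += 1
    return dict(counts)


def years_where_ahead(records, topic_a, topic_b):
    """Return the sorted years in which topic_a is counted more often than topic_b."""
    a = yearly_counts(records, topic_a)
    b = yearly_counts(records, topic_b)
    # Consider every year in which at least one of the topics occurs.
    years = sorted(set(a) | set(b))
    return [y for y in years if a.get(y, 0) > b.get(y, 0)]
```

For example, with toy records in which "weed management" overtakes "herbicide" from 2018 onward, `years_where_ahead(records, "weed management", "herbicide")` would list 2018 and the later years, which is the kind of reversal described in the text.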
The phenotype issue has also received considerable attention in the agricultural research community. Because the foundation of phenotyping is the breeding selection process, the topics representing common breeding technologies were analyzed by keyword statistics (Costa et al., 2019). The bibliometric information was collected from the journal Field Crops Research from 2010 to 2019. The data showed that the main topics were QTL (quantitative trait locus or loci), GWAS (genome-wide association study), and SNP(s) (single-nucleotide polymorphism(s)). Interestingly, GWAS has been the major breeding technology since 2018, which may be because molecular biotechnology has become more affordable and easier to access.
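Because each of these topics appears both as an abbreviation and as a full name, counting them requires merging the variants first. The following sketch shows one way this might be done, again assuming `(year, keywords)` pairs per article. The `TOPIC_VARIANTS` table and the `leading_topic_by_year` function are hypothetical and only approximate the grouping described in the text.

```python
from collections import Counter, defaultdict

# Hypothetical variant table: canonical label -> lowercase spellings.
TOPIC_VARIANTS = {
    "QTL": ["qtl", "quantitative trait locus", "quantitative trait loci"],
    "GWAS": ["gwas", "genome-wide association study"],
    "SNP": ["snp", "snps", "single-nucleotide polymorphism",
            "single-nucleotide polymorphisms"],
}
# Reverse lookup: lowercase spelling -> canonical label.
LOOKUP = {v: label for label, variants in TOPIC_VARIANTS.items() for v in variants}


def leading_topic_by_year(records):
    """For each year, return the tracked topic attached to the most articles.

    Ties are broken arbitrarily; years with no tracked topic are omitted.
    """
    per_year = defaultdict(Counter)
    for year, keywords in records:
        # A set makes an article count once per topic even if it lists
        # both the abbreviation and the full name.
        topics = {LOOKUP[k.lower()] for k in keywords if k.lower() in LOOKUP}
        per_year[year].update(topics)
    return {
        year: counts.most_common(1)[0][0]
        for year, counts in sorted(per_year.items())
        if counts
    }
```

Counting each article once per topic avoids inflating a topic simply because authors list both "QTL" and "quantitative trait loci" as keywords.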
In recent years, science mapping has become a popular methodology for governmental organizations and global enterprises to discover the structural and dynamic aspects of technology trends (Tatry et al., 2014). With the increasing accessibility of information technology, using text mining as a digital tool to discover emerging topics for policymakers has also become a straightforward method (Arnaud et al., 2020). Bibliometric data from numerous official documents are valuable because they raise awareness of the knowledge produced by the agricultural scientific community.
Generally speaking, it takes five to ten years for scientific results to develop into industrial applications. However, research papers can be used in mapping analysis to estimate the potential of a technology for industrial application. Moreover, the keywords in the pioneer journals could help technology managers build knowledge structures for significant issues, especially multidisciplinary ones, and help academics focus on topics that need extension or improvement (Natale et al., 2012).
In this study, data mining was performed by a web crawler written in an open-source programming language, Python. The bibliometric analysis results revealed the crucial issues in agricultural science and technology. Moreover, the hot topics were identified through keyword analysis, which reflects the knowledge structure from the viewpoint of the scientific community network. These results could serve as a reference for the Council of Agriculture, Executive Yuan, R.O.C. (Taiwan) in promoting industrial competitiveness through the development of innovative technologies. For instance, the knowledge structure of intelligent agriculture can help re-examine the outcomes of technology policy in farming enterprises and ensure that those hot topics are adopted in industrial applications. The trend analysis results may also help policymakers reallocate governmental resources to advance farming techniques and face global competition.
To understand agricultural technology trends from 2009 to 2019, this study applied an integrated science mapping method that combines data mining and bibliometric analysis. Five pioneer journals representing different research areas (multidisciplinary science, agricultural and biological science, environmental science, and computer science) were selected to gain insight into the hot topics. Overall, keyword analysis yielded 205 topics in 10 issues, which are considered the major trends in future technology development. The hot issues of intelligent agriculture, management, and phenotype were subsequently analyzed by statistical methods. The knowledge structures constructed for each issue give information and technology innovation departments a clear and complete scope for agricultural innovation. In summary, the combination of data mining and bibliometrics provides an overview of the current status of global agricultural science, which may give policymakers and researchers insight into the future of farming technology for sustainable agriculture.
Bojović, S., Matić, R., and Popović, Z. (2014). An overview of forestry journals in the period 2006–2010 as basis for ascertaining research trends. Scientometrics, 98, 1331–1346. https://doi.org/10.1007/s11192-013-1171-9.
Costa, C., Schurr, U., Loreto, F., Menesatti, P., & Carpentier, S. (2019). Plant Phenotyping Research Trends, a Science Mapping Approach. Frontiers in Plant Science, 9, 1933. https://doi.org/10.3389/fpls.2018.01933.
Elizabeth Arnaud, Marie-Angélique Laporte, Soonho Kim, Céline Aubert, Sabina Leonelli, Berta Miro, Laurel Cooper, Pankaj Jaiswal, Gideon Kruseman, Rosemary Shrestha, Pier Luigi Buttigieg, Christopher J. Mungall, Julian Pietragalla, Afolabi Agbona, Jacqueline Muliro, Jeffrey Detras, Vilma Hualla, Abhishek Rathore, Roma Rani Das, Ibnou Dieng, Guillaume Bauchet, Naama Menda, Cyril Pommier, Felix Shaw, David Lyon, Leroy Mwanzia, Henry Juarez, Enrico Bonaiuti, Brian Chiputwa, Olatunbosun Obileye, Sandrine Auzoux, Esther Dzalé Yeumo, Lukas A. Mueller, Kevin Silverstein, Alexandra Lafargue, Erick Antezana, Medha Devare, and Brian King. (2020). The Ontologies Community of Practice: A CGIAR Initiative for Big Data in Agrifood Systems. Patterns, 1(7), 100105. https://doi.org/10.1016/j.patter.2020.100105.
E. Ronner, A.C. Franke, B. Vanlauwe, M. Dianda, E. Edeh, B. Ukem, A. Bala, J. van Heerwaarden, and K.E. Giller. (2016). Understanding variability in soybean yield and response to P-fertilizer and rhizobium inoculants on farmers’ fields in northern Nigeria. Field Crops Research, 186: 133-145. https://doi.org/10.1016/j.fcr.2015.10.023.
Hassan M. Abd El Baki, Haruyuki Fujimaki, Ieyasu Tokumoto, and Tadaomi Saito. (2018). A new scheme to optimize irrigation depth using a numerical model of crop response to irrigation and quantitative weather forecasts. Computers and Electronics in Agriculture, 150: 387-393. https://doi.org/10.1016/j.compag.2018.05.016.
Natale, F., Fiore, G. & Hofherr, J. (2012). Mapping the research on aquaculture. A bibliometric analysis of aquaculture literature. Scientometrics, 90, 983–999. https://doi.org/10.1007/s11192-011-0562-z.
Rodriguez‐Ledesma, A., Cobo, M., Lopez‐Pujalte, C. and Herrera‐Viedma, E. (2015). An overview of animal science research 1945–2011 through science mapping analysis. Journal of Animal Breeding and Genetics, 132: 475-497. https://doi.org/10.1111/jbg.12124.
SCImago. (n.d.). SJR - SCImago Journal & Country Rank. Retrieved from http://www.scimagojr.com.
Tatry, M.V., Fournier, D., Jeannequin, B. et al. (2014). EU27 and USA leadership in fruit and vegetable research: a bibliometric study from 2000 to 2009. Scientometrics, 98, 2207–2222. https://doi.org/10.1007/s11192-013-1160-z.