Add reviewer letter

Aleksey Filippov 2023-06-17 01:04:18 +04:00
parent adc1f8b9c3
commit 0cf11b9dd5

letter.txt Normal file

@ -0,0 +1,65 @@
We would like to thank the reviewer for their comments. We hope that the revisions have improved the article.
====
Q: The concept of "project" is not formalized. Is it a set of source code files or source code + configuration (or something else)?
A: We have added the following description:
We represent a software system project as a set of source code files. The source code of the software system is the main data source from which the proposed algorithm identifies structural features.
We build an abstract syntax tree (AST) to analyze the source code. Libraries and tools for building an AST are available for most widely used programming languages. We use our own AST representation so that support for new programming languages can be added without changing the analysis algorithms. Therefore, for each language we need to develop a converter that transforms the AST generated by that language's parser into our AST representation.
For example, for Java-based software systems we analyze Java files and their hierarchy at the package level. We use the JavaParser library to build the AST for Java projects. The algorithm considered below transforms the AST generated by the JavaParser library into our AST representation.
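The idea of converting a parser-specific AST into a language-neutral representation can be illustrated with a minimal sketch. This is not the converter from the article: it uses Python's standard ast module as a stand-in for JavaParser, and the Node class, the kind labels, and the convert function are hypothetical names introduced only for illustration.

```python
import ast
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical language-neutral AST node (not the article's format)."""
    kind: str   # e.g. "class", "method", "argument"
    name: str
    children: List["Node"] = field(default_factory=list)


def convert_children(parser_node: ast.AST) -> List[Node]:
    """Convert every direct child of a parser node and flatten the results."""
    converted: List[Node] = []
    for child in ast.iter_child_nodes(parser_node):
        converted.extend(convert(child))
    return converted


def convert(parser_node: ast.AST) -> List[Node]:
    """Map one parser-specific node to zero or more neutral nodes."""
    if isinstance(parser_node, ast.ClassDef):
        return [Node("class", parser_node.name, convert_children(parser_node))]
    if isinstance(parser_node, (ast.FunctionDef, ast.AsyncFunctionDef)):
        # Positional arguments become "argument" nodes of the method.
        arguments = [Node("argument", a.arg) for a in parser_node.args.args]
        return [Node("method", parser_node.name,
                     arguments + convert_children(parser_node))]
    # Other node types are skipped; only their converted descendants are kept.
    return convert_children(parser_node)
```

With this layout, the analysis code depends only on Node, so supporting another language would mean writing another convert function for that language's parser output.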
====
Q: Non-latin captions in the figures must be eliminated.
A: We have added the English version of this figure.
====
Q: Algorithms must be formalized more strictly (it is better to separate them into tables)
A: We have added additional examples; see Table 1.
====
Q: The figure with the source code makes no sense.
A: We have replaced the figure containing the source code with a listing.
====
Q: Neo4j, Cypher-query and apoc.util.md5 are not explained
A: We have added the following descriptions.
We use Neo4j \cite{ref_neo4j} as the data storage. Neo4j is a graph database (GDB) management system. Neo4j allows us to store nodes and the edges connecting them, and to attach additional attributes to both nodes and edges. Neo4j remains fast even with a large amount of stored data.
A GDB is a non-relational type of database based on the topological structure of a network. GDBs are more flexible than relational databases and allow us to quickly retrieve data of various types while taking numerous relations into account.
Cypher \cite{ref_cypher} is the Neo4j query language. It provides a convenient way to express queries and other Neo4j actions. Although Cypher is particularly useful for exploratory work, it is fast enough to be used in production.
We also use the apoc.util.md5 function from the APOC plugin \cite{ref_md5}. This function computes the MD5 hash of the concatenation of the string representations of all entities in a list.
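The hashing step can be sketched outside the database as follows. This is only an approximation written with Python's hashlib: the md5_of_values name is hypothetical, and Python's str() conversion is assumed here as a stand-in for the string representation that Neo4j produces, so the resulting hashes are not guaranteed to match those of apoc.util.md5 for non-string values.

```python
import hashlib
from typing import Iterable


def md5_of_values(values: Iterable[object]) -> str:
    """Return the MD5 hex digest of the concatenated string forms of values.

    Assumption: str() approximates the database's string representation.
    """
    joined = "".join(str(value) for value in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()
```

Because the values are concatenated without a separator, the lists ["a", "b"] and ["ab"] produce the same hash; if that matters for a given use, a delimiter would have to be included in the values themselves.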
= And =
The main aim of the experiment is to determine the speed of the algorithm, measured as the average number of lines of code processed per minute. We used the IntelliJ IDEA Statistic plugin \cite{ref_Statistic} to obtain the data for the experiment. The plugin calculates the number of files, their size, the number of lines, average values and other information for the files in the project. It also reports the total number of lines, the number of lines of code, the proportion of lines of code, the number of comment lines, the proportion of comment lines, etc.
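The speed metric can be written down as a short calculation. The sketch below is an assumption about how the average might be computed (total lines of code divided by total processing time over all runs); the function names and the (lines, seconds) input format are hypothetical and are not taken from the article.

```python
from typing import Iterable, Tuple


def lines_per_minute(lines_of_code: int, elapsed_seconds: float) -> float:
    """Lines of code processed per minute for a single run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return lines_of_code * 60.0 / elapsed_seconds


def average_speed(runs: Iterable[Tuple[int, float]]) -> float:
    """Average speed over several runs given as (lines_of_code, seconds).

    Assumption: the average is weighted by time, i.e. total lines
    divided by total time, rather than a mean of per-run speeds.
    """
    runs = list(runs)
    total_lines = sum(lines for lines, _ in runs)
    total_seconds = sum(seconds for _, seconds in runs)
    return lines_per_minute(total_lines, total_seconds)
```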
====
Q: ids from table 1 (now 2) are not explicitly described and formalized
A: We have added the following description:
The length of the collection in the ids column shows how many projects contain the i-th structural element. The length of each element of this collection gives the length of the chain of structural elements, which is used to calculate the project originality degree.
We have also revised the Cypher query so that its result no longer contains duplicate values.
====
Q: What does F(*) mean on page 3?
A: We have added the following:
In this algorithm, $F$ is a search function that finds nested nodes. Its parameter is a node or subtree, and its output is the set of nodes of the desired type: class, class field, method, method argument or statement (operator).
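A search function of this kind can be sketched as a recursive traversal. This is an illustrative assumption, not the definition from the article: the Node class is hypothetical, and the desired type is passed here as an explicit second argument, whereas the article describes only the node or subtree parameter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node used only for this illustration."""
    kind: str   # e.g. "class", "field", "method", "argument", "statement"
    name: str
    children: List["Node"] = field(default_factory=list)


def F(root: Node, kind: str) -> List[Node]:
    """Return all nodes of the given kind nested below root, in depth-first order.

    The root itself is not included, since only nested nodes are searched.
    """
    found: List[Node] = []
    for child in root.children:
        if child.kind == kind:
            found.append(child)
        found.extend(F(child, kind))
    return found
```

For a class node, F(node, "method") would return its methods, and applying F to one of those methods with kind "argument" would return that method's arguments.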
====