1. Hadoop Overview

Hadoop is an open-source software platform under Apache. It uses server clusters to store and process massive, distributed data sets according to the user's custom business logic. The core components of Hadoop are:

- HDFS (Distributed File System)
- YARN (Operation Resource Scheduling System)
- MAPREDUCE (Distributed Computing Programming Framework)

2. The Concept of HDFS

HDFS is a file system for storing files and locating them through a unified namespace (directory tree). It is distributed, federated from a number of servers that each have their own roles. Its main features are as follows:

- Files in HDFS are physically split into blocks. The block size can be specified by the configuration parameter dfs.blocksize; the default is 128M in the Hadoop 2.x versions and 64M in older versions. (A short Java sketch after section 6 below shows this parameter, together with dfs.replication, in use.)
- The HDFS file system presents clients with a unified abstract directory tree. Clients access files through paths such as hdfs://namenode:port/dir-a/dir-b/dir-c/file.data.
- Directory structure and file block information (metadata) are managed by the namenode. The namenode is the HDFS cluster master node, responsible for maintaining the directory tree of the entire HDFS file system and the block information corresponding to each path (file): the block ids and the datanode servers where the blocks reside.
- Storage of the blocks themselves is handled by the datanodes. A datanode is an HDFS cluster slave node, and each block can be stored as multiple replicas on multiple datanodes (the number of replicas is set by the parameter dfs.replication).
- HDFS is designed for write-once, read-many access and does not support file modification.

3. Scenarios Where HDFS Is Not Suitable

- Low-latency data access: HDFS's strength is high-throughput transfer of large volumes of data, not low latency; accesses on the order of 10 milliseconds are beyond it. HBase can make up for this shortcoming.
- Too many small files: the namenode holds the metadata of the entire file system in memory, so the number of files it can track is limited. Each file's metadata takes about 150 bytes; with one million files, each occupying only one block, you already need roughly 300MB of namenode memory. How many files can your server hold?
- Multiple writers and random modification: HDFS currently does not support multiple concurrent writers, nor modifications at arbitrary offsets within a file.

4. Namenode and Datanodes

An HDFS cluster has two types of nodes: the master, which is the namenode, and the workers, which are the datanodes. The namenode manages the namespace of the file system. It maintains the file system tree, and the metadata of all files and directories lives in this tree. This information is stored in two files on the local disk: the namespace image file and the edit log file. Which blocks a file consists of, and where each block resides, is loaded into the namenode's memory at system startup and is not stored on disk. The datanode's role in the file system is that of a laborer: it stores or retrieves blocks according to the instructions of the namenode and clients, and it periodically reports to the namenode which blocks it stores.

5. Secondarynamenode

The secondarynamenode is a snapshot of the namenode. Based on the value set in the configuration, it decides how often to periodically copy the namenode's state, recording the metadata and other data held by the namenode.

6. NodeManager (NM)

The NodeManager is the agent on each node in YARN. It manages a single compute node in a Hadoop cluster, including communicating with the ResourceManager, supervising the lifecycle of containers, monitoring the resource usage (memory, CPU, etc.) of each container, and tracking node health.
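To make the block-size and replication parameters from section 2 concrete, here is a minimal Java sketch that reads the cluster defaults and then creates a file with per-file overrides through the FileSystem API. The cluster address hdfs://namenode:9000, the user root, and the path /demo/sample.txt are placeholder values for illustration, not taken from the original text.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hdfs://namenode:9000 and "root" are placeholder connection values
        FileSystem fs = FileSystem.get(new URI("hdfs://namenode:9000"), conf, "root");

        // Print the cluster-wide defaults described in section 2
        // (fallback values here: 128M block size, 3 replicas)
        System.out.println("dfs.blocksize   = " + conf.getLong("dfs.blocksize", 128L * 1024 * 1024));
        System.out.println("dfs.replication = " + conf.getInt("dfs.replication", 3));

        // Per-file settings override the cluster defaults: 2 replicas, 64M blocks
        Path file = new Path("/demo/sample.txt"); // hypothetical path
        FSDataOutputStream out = fs.create(file, true, 4096, (short) 2, 64L * 1024 * 1024);
        out.writeUTF("hello hdfs");
        out.close();
        fs.close();
    }
}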
7. ResourceManager

In YARN, the ResourceManager is responsible for the unified management and allocation of all resources in the cluster. It receives resource reports from each node's NodeManager and allocates those resources to applications according to a configured policy (strictly speaking, to each application's ApplicationMaster). The ResourceManager works together with the per-node NodeManagers (NMs) and the per-application ApplicationMasters (AMs); each ApplicationMaster is responsible for negotiating resources with the ResourceManager and for working with the NodeManagers to start its containers.

8. Common HDFS Client Commands

If running hadoop fs -ls / prints the warning

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...

the cause is that the system's pre-installed glibc library is version 2.12 while Hadoop expects version 2.14, so a warning message is printed. Solution: suppress the message via log4j by adding the following line to /hadoop-2.5.2/etc/hadoop/log4j.properties:

log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR

The common commands are listed below (a Java sketch after this list shows how to invoke them programmatically):

- -ls: display directory information. Example: hadoop fs -ls hdfs://localhost:9000/ Note: the HDFS paths in all of these commands can be abbreviated, so hadoop fs -ls / has the same effect as the previous command.
- -mkdir: create a directory on HDFS. Example: hadoop fs -mkdir -p /aaa/bbb/cc/dd
- -moveFromLocal: cut and paste from local to HDFS. Example: hadoop fs -moveFromLocal /home/hadoop/a.txt /aaa/bbb/cc/dd
- -moveToLocal: cut and paste from HDFS to local. Example: hadoop fs -moveToLocal /aaa/bbb/cc/dd /home/hadoop/a.txt
- -appendToFile: append a file to the end of an existing file. Example: hadoop fs -appendToFile ./hello.txt hdfs://hadoop-server01:9000/hello.txt, which can be abbreviated as: hadoop fs -appendToFile ./hello.txt /hello.txt
- -cat: display file content. Example: hadoop fs -cat /hello.txt
- -chmod: same usage as in the Linux file system; changes a file's permissions. Example: hadoop fs -chmod 666 /hello.txt
- -copyFromLocal: copy a file from the local file system to an HDFS path. Example: hadoop fs -copyFromLocal ./jdk.tar.gz /aaa/
- -copyToLocal: copy from HDFS to local. Example: hadoop fs -copyToLocal /aaa/jdk.tar.gz
- -cp: copy from one HDFS path to another HDFS path. Example: hadoop fs -cp /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
- -mv: move files within HDFS. Example: hadoop fs -mv /aaa/jdk.tar.gz /
- -get: equivalent to copyToLocal; downloads a file from HDFS to local. Example: hadoop fs -get /aaa/jdk.tar.gz
- -getmerge: merge and download multiple files. Example: if the HDFS directory /aaa/ contains the files log.1, log.2, log.3, ...: hadoop fs -getmerge /aaa/log.* ./log.sum
- -put: equivalent to copyFromLocal. Example: hadoop fs -put /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
- -rm: delete a file or folder. Example: hadoop fs -rm -r /aaa/bbb/
- -rmdir: delete an empty directory. Example: hadoop fs -rmdir /aaa/bbb/ccc
- -df: show the file system's free space. Example: hadoop fs -df -h /
- -du: show folder size statistics. Example: hadoop fs -du -s -h /aaa/*
- -count: count the file nodes under a specified directory. Example: hadoop fs -count /aaa/
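As a bridge between these shell commands and the Java API in section 9, the fs commands can also be invoked programmatically through org.apache.hadoop.fs.FsShell. This is a minimal sketch; the cluster address hdfs://localhost:9000 is a placeholder, and the argument array mirrors the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class FsShellDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder cluster address; adjust to your namenode
        conf.set("fs.defaultFS", "hdfs://localhost:9000");

        // Equivalent to: hadoop fs -ls /
        int exitCode = ToolRunner.run(new FsShell(conf), new String[]{"-ls", "/"});
        System.out.println("exit code: " + exitCode);
    }
}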
9. Operating HDFS from Java

1. Add the hadoop-client dependency:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.1</version>
</dependency>

2. Test class:

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.junit.Before;
import org.junit.Test;

public class OperationFileTest {

    FileSystem fs = null;
    Configuration conf = null;

    /** Initialize the client instance object for file system operations */
    @Before
    public void init() throws IOException, URISyntaxException, InterruptedException {
        conf = new Configuration();
        // uri and user identity ---> client instance object for a file system operation
        fs = FileSystem.get(new URI("hdfs://192.168.133.11:9000"), conf, "root");
    }

    /** Upload a file */
    @Test
    public void testUpload() throws IOException {
        fs.copyFromLocalFile(new Path("D:\\examProject.rar"), new Path("/examProject.rar"));
        fs.close();
    }

    /** Download a file */
    @Test
    public void testDownload() throws Exception {
        Path remotePath = new Path("/examProject.rar");
        Path localPath = new Path("f:/");
        fs.copyToLocalFile(remotePath, localPath);
        fs.close();
    }

    /** Print the content loaded into conf */
    @Test
    public void testConf() {
        Iterator<Map.Entry<String, String>> iterator = conf.iterator();
        while (iterator.hasNext()) {
            Map.Entry<String, String> item = iterator.next();
            System.out.println(item.getKey() + "--" + item.getValue());
        }
    }

    /** Create a directory */
    @Test
    public void mkdirTest() throws Exception {
        boolean mkdirs = fs.mkdirs(new Path("/a/b"));
        System.out.println(mkdirs);
    }

    /** Delete */
    @Test
    public void deleteTest() throws Exception {
        boolean delete = fs.delete(new Path("/a"), true); // true: delete recursively
        System.out.println(delete);
    }

    /** List the file paths in the root directory */
    @Test
    public void listTest() throws Exception {
        FileStatus[] listStatus = fs.listStatus(new Path("/"));
        for (FileStatus fileStatus : listStatus) {
            System.err.println(fileStatus.getPath() + "=================" + fileStatus.toString());
        }
        // will recursively find all files
        RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
        while (listFiles.hasNext()) {
            LocatedFileStatus next = listFiles.next();
            String name = next.getPath().getName();
            Path path = next.getPath();
            System.out.println(name + "---" + path.toString());
        }
    }
}
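The test class covers uploading, downloading, inspecting the configuration, creating directories, deleting, and listing; one common operation it does not show is reading a file's content as a stream. The following is a minimal sketch of that, reusing the cluster address and user from the test class above and the /hello.txt file from the -cat example in section 8; it is the programmatic equivalent of hadoop fs -cat /hello.txt.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadFileDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same cluster address and user as the test class above
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.133.11:9000"), conf, "root");

        // Open an input stream on an HDFS file (/hello.txt from the -cat example)
        FSDataInputStream in = fs.open(new Path("/hello.txt"));
        try {
            // Copy the stream to stdout; do not let copyBytes close the streams itself
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }
    }
}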