## Anxinyun Analyze 安心云数据分析工具, ![image-20210927161719070](imgs/安心云数据分析工具/image-20210927161719070.png) 参考实现优矿: ![image-20210927161703234](imgs/安心云数据分析工具/image-20210927161703234.png) 数据分析工具框图 by Tyr.Liu ![img](imgs/安心云数据分析工具/anxinyun-jupyter-notebook.png) 启动: config.js ```js { entry: require('./middlewares/orchestrator').entry, opts: { kubernetes: { url: ANXINCLOUD_K8S_API || 'https://10.8.30.157:6443', insecureSkipTlsVerify: true, version: 'v1', promises: true, namespace: 'anxinyun', auth: { bearer: ANXINCLOUD_K8S_AUTH || '......' } }, //runInPod: true, apiUrl: ANXINCLOUD_API, notebookToken: '6bf509929765366acb8ef066aa30d2cfc57af186a25f229a', instanceName: 'anxinyun-jupyter-notebook', proxyPort: 18305 } } ``` ```sh set NODE_ENV=development&&node server -p 8000 -u http://10.8.30.157:19084 --qnak YwL-KPPPrPFqm5VfCDLSSePi6pa0c0rxbTDGVUSQ --qnsk dFHk_EfTk6ufIaG56h4gzcL3IrAtwl2RkJcl8XuO --qnbkt notebook-test --qndmn http://pcd3v07yz.bkt.clouddn.com # -u 数据API地址 ``` ## Jupyter Nootbook ![img](imgs/安心云数据分析工具/20180518221933692) 新的名称 Jupyter 由`Julia`、`Python` 和 `R` 组合而成 安装使用 ```sh pip3 install jupyter # 使用帮助 jupyter notebook -h # 启动 jupyter notebook ``` 访问 http://localhost:8888/ ![image-20210927091813110](imgs/安心云数据分析工具/image-20210927091813110.png) 修改文件路径 ```sh jupyter notebook --generate-config # file:///C:/Users/yww08/.jupyter/jupyter_notebook_config.py # c.NotebookApp.notebook_dir = 'E:/tmp/notebook' ``` Magic操作(基于IPython) ```python %%timeit 测算整个单元格的运行时间 ``` ![image-20210927162140861](imgs/安心云数据分析工具/image-20210927162140861.png) ## K8S API 鉴权 ```sh kubectl get sa -n anxinyun # clusterrole.yaml kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: namespace: anxinyun name: operator rules: - apiGroups: [""] # "" indicates the core API group resources: ["services","pods"] verbs: ["get", "watch", "list","create","update","patch"] kubectl create clusterrolebinding operator-pod \ --clusterrole=operator \ --serviceaccount=anxinyun:default fastest@test-master:~$ kubectl get sa -n anxinyun -o yaml apiVersion: v1 items: - apiVersion: v1 kind: ServiceAccount metadata: creationTimestamp: "2020-08-17T10:35:35Z" name: default namespace: anxinyun resourceVersion: "5982" selfLink: /api/v1/namespaces/anxinyun/serviceaccounts/default uid: a1100eea-19c2-4477-afca-61344353f2e5 secrets: - name: default-token-zp6cz kind: List metadata: resourceVersion: "" selfLink: "" fastest@test-master:~$ kubectl describe secret default-token-zp6cz -n anxinyun Name: default-token-zp6cz Namespace: anxinyun Labels: Annotations: kubernetes.io/service-account.name: default kubernetes.io/service-account.uid: a1100eea-19c2-4477-afca-61344353f2e5 Type: kubernetes.io/service-account-token Data ==== ca.crt: 1025 bytes namespace: 8 bytes token: eyJhbGciOiJSUzI1NiIsImtpZCI6ImFiVlF0Y1NyZjNNTkRVMFVieTNNTzhyVlc5T094Y3J2RmFfYTF6R0pveDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJhbnhpbnl1biIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkZWZhdWx0LXRva2VuLXpwNmN6Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImRlZmF1bHQiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhMTEwMGVlYS0xOWMyLTQ0NzctYWZjYS02MTM0NDM1M2YyZTUiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6YW54aW55dW46ZGVmYXVsdCJ9.elI35PPYtQp-fletleFR7so88Vozk7g8B7oRa1zy2LxSL1m26s8X6SJAipR5uqweNyi8JML3Yo3lPhs6mmzNLxkTRVk1atyXcCSr6J_iPD2dUUaGTL-ZPRYZ1x8Eb2PfugEQM5tf5YXERXqPpEsxTLM83KkI8ogFJQhLG7s-lWZFbvcgmKpCo3lmzuYf-hO0-JngjLhRxptCUqaFx6s8QwQz0dxNn_EvtMXbZm2cTkewJdsFAzczuKtt2sLiJCl5CSRghWAqkP9pBiC2diwDKzz9A0DevG0b3n7J-9_4fPtbXa5zQI60Rg3XVZRof0XNjw5Nze0ee8bn-6XI8yxIug fastest@test-master:~$ ``` ## 本地Micro-K8S 安装WSL 适用于 Linux 的 Windows 子系统可让开发人员直接在 Windows 上按原样运行 GNU/Linux 环境(包括大多数命令行工具、实用工具和应用程序),且不会产生传统虚拟机或双启动设置开销。 [安装 WSL](https://docs.microsoft.com/zh-cn/windows/wsl/setup/environment#set-up-your-linux-user-info) ```sh POWERSHELL 》wsl --install PS C:\Users\yww08> wsl --list --online 以下是可安装的有效分发的列表。 请使用“wsl --install -d <分发>”安装。 NAME FRIENDLY NAME Ubuntu Ubuntu Debian Debian GNU/Linux kali-linux Kali Linux Rolling openSUSE-42 openSUSE Leap 42 SLES-12 SUSE Linux Enterprise Server v12 Ubuntu-16.04 Ubuntu 16.04 LTS Ubuntu-18.04 Ubuntu 18.04 LTS Ubuntu-20.04 Ubuntu 20.04 LTS PS C:\Users\yww08> wsl --install -d Ubuntu-18.04 正在下载: Ubuntu 18.04 LTS [====================== 38.4% ] 创建linux用户 Installing, this may take a few minutes... Please create a default UNIX user account. The username does not need to match your Windows username. For more information visit: https://aka.ms/wslusers Enter new UNIX username: yww Enter new UNIX password: 123 Retype new UNIX password: 123 passwd: password updated successfully Installation successful! To run a command as administrator (user "root"), use "sudo ". See "man sudo_root" for details. 更新首选包管理器定期更新和升级包 sudo apt update && sudo apt upgrade 通过/mnt/c/ 可以访问宿主机上的文件 ``` 设置Windows终端: ![image-20210927163121708](imgs/安心云数据分析工具/image-20210927163121708.png) `ctrl+shift+d` 新Tab `alt+shift+d` Split窗口 `ctrl+shift+w` 关闭 ```sh #安装docker apt install docker.io sudo usermod -aG docker $USER sudo cgroupfs-mount sudo service docker start #systemctl daemon-reload #systemctl restart docker.service # 上面的安装方法有问题 curl https://get.docker.com | sh ``` 【MicroK8S】 [Github](https://github.com/ubuntu/microk8s) Install MicroK8s with: 尝试失败了!!! ``` snap install microk8s --classic ``` MicroK8s includes a `microk8s kubectl` command: ``` sudo microk8s kubectl get nodes sudo microk8s kubectl get services ``` To use MicroK8s with your existing kubectl: ``` sudo microk8s kubectl config view --raw > $HOME/.kube/config ``` 将用户添加如 microk8s用户组,以实现对k8s的访问 ``` sudo usermod -a -G microk8s ``` Kubernetes插件 MicroK8s installs a barebones upstream Kubernetes. Additional services like dns and the Kubernetes dashboard can be enabled using the `microk8s enable` command. ``` sudo microk8s enable dns dashboard ``` Use `microk8s status` to see a list of enabled and available addons. You can find the addon manifests and/or scripts under `${SNAP}/actions/`, with `${SNAP}` pointing by default to `/snap/microk8s/current`. **Copy from Kai.Lu** 镜像准备`fetch-images.sh` ```sh #!/bin/bash images=( k8s.gcr.io/pause:3.1=mirrorgooglecontainers/pause-amd64:3.1 gcr.io/google_containers/defaultbackend-amd64:1.4=mirrorgooglecontainers/defaultbackend-amd64:1.4 k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1=registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 k8s.gcr.io/heapster-influxdb-amd64:v1.3.3=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.3.3 k8s.gcr.io/heapster-amd64:v1.5.2=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.2 k8s.gcr.io/heapster-grafana-amd64:v4.4.3=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v4.4.3 k8s.gcr.io/metrics-server-amd64:v0.3.6=registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6 ) OIFS=$IFS; # 保存旧值 for image in ${images[@]};do IFS='=' set $image docker pull $2 docker tag $2 $1 docker rmi $2 docker save $1 > 1.tar && microk8s.ctr --namespace k8s.io image import 1.tar && rm 1.tar IFS=$OIFS; # 还原旧值 done ``` ```sh ./fetch-images.sh microk8s status --wait-ready alias mk='microk8s.kubectl' mk get pods -A ``` ## Python ```python #!/usr/bin/python # -*- coding: UTF-8 -*- print( "你好,世界" ) #Python3.X 源码文件默认使用utf-8编码,所以可以正常解析中文,无需指定 UTF-8 编码。 注释: # 单行 ''' 多行 ''' """ 多行 """ 类型: 数字 字符串 列表 list1 = ['Google', 'Runoob', 1997, 2000] 元祖 (元素值是不允许修改) (tup1 = ('physics', 'chemistry', 1997, 2000)) 字典 dict1 = { 'abc': 456 } 集合 a = set('abracadabra') 或者 可以用大括号({})创建集合。注意:如果要创建一个空集合,你必须用 set() 而不是 {} ;后者创建一个空的字典,下一节我们会介绍这个数据结构。 删除元素 del a['k'] range 时间: import time # 引入time模块 ticks = time.time() print "当前时间戳为:", ticks ``` | en((1, 2, 3)) | 3 | 计算元素个数 | | ---------------------------- | ---------------------------- | ------------ | | (1, 2, 3) + (4, 5, 6) | (1, 2, 3, 4, 5, 6) | 连接 | | ('Hi!',) * 4 | ('Hi!', 'Hi!', 'Hi!', 'Hi!') | 复制 | | 3 in (1, 2, 3) | True | 元素是否存在 | | for x in (1, 2, 3): print x, | 1 2 3 | 迭代 | 笔记 ```python a,b=0,1 while b<10: print(b,end=',') a,b=b,a+b print() # while else 循环 c=0 while c<10: print(c) c+=2 else: print("after c=",c) # set顺序是乱的 chars=set('abcdefg') for c in chars: print(c) # range([start,]stop[,step]) for i in range(5,21,2): print(i) # for array a=['a','b','c'] for i in range(len(a)): print(i,':',a[i]); # 空语句 可以用pass占位 ## 迭代器 a=[1,2,3,4] it=iter(a) print(next(it)) for i in it: print('in range:',i) # while next 写法 import sys list=[1,2,3,4] it = iter(list) # 创建迭代器对象 while True: try: print (next(it)) except StopIteration: print ('finished') break #sys.exit() print('is here?') # 通过yield生成斐波那契数列 def fibonacci(n): a,b,count=0,1,0 while True: if (count>n): return yield a a,b=b,a+b count+=1 f=fibonacci(10) for fi in f: print('f',fi,end=',') # 函数 def hello() : print("Hello World!") ret=hello() print('ret:'+str(ret)) ''' 不可变类型:strings,tuples,numbers 作为函数参数类似C++中值传递 可变类型: list,dict 作为函数参数类似C++中的引用传递 参数可以按名称传递、可以有默认值 可变长参数如下 ''' def printinfo(arg1,*vartuple): print(arg1) print(len(vartuple)) printinfo(7) # 加了两个星号 ** 的参数会以字典的形式导入。 sum=lambda a,b:a+b print(sum(1,20)) # 列表推导式 vec=[2,4,6] dd=[3*x for x in vec] print(dd) # 字典的便利 dics={'name':'ww','age':18} for k,v in dics.items(): print(k,'=',b) # 遍历技巧 # for i,v in enumerate(list) 同时获得索引和值 # zip(list1,list2) 组合两个序列 # reversed(seq) 反向 # sorted(seq) 排序 ## 类型转换 # int(x[, base]) 将x转换为一个整数,base为进制,默认十进制 # # long(x[, base] ) 将x转换为一个长整数 # # float(x) 将x转换到一个浮点数 # # complex(real[, imag]) 创建一个复数 # # str(x) 将对象 x 转换为字符串 # # repr(x) 将对象 x 转换为表达式字符串 # # eval(str) 用来计算在字符串中的有效Python表达式, 并返回一个对象 # # tuple(s) 将序列 s 转换为一个元组 # # list(s) 将序列 s 转换为一个列表 # # set(s) 转换为可变集合 # # dict(d) 创建一个字典。d 必须是一个序列(key, value) 元组。 # # frozenset(s) 转换为不可变集合 # # chr(x) 将一个整数转换为一个字符 # # unichr(x) 将一个整数转换为Unicode字符 # # ord(x) 将一个字符转换为它的整数值 # # hex(x) 将一个整数转换为一个十六进制字符串 # # oct(x) 将一个整数转换为一个八进制字符串 ``` 模块学习: `m.py` ```python # !/usr/bin/python3 if __name__=='__main__': # 程序独立运行 pass else: # 程序被模块调用 print('moduled') def fabonacci(n): a,b,c=0,1,0 while c ```python try: ... except OSError as err: print('OS Error: {0}'.format(err)) except (RuntimeError,TypeError,NameError): pass except: raise else: # 没有发生异常时执行 ... finally: # 永远执行 ... ``` 面向对象 ```python #!/usr/bin/python3 class people: name='' # 构造函数 def __init__(self,n,a): self.name=n self.age=a # 类方法 def speak(self): print("%s speak age %d"%(self.name,self.age)) # 继承 (同时支持多继承,从左往右的规则搜索父类方法) class student(people): grade='' # 私有变量 __private_attrs=0 def __init__(self,n,a,g): people.__init__(self,n,a) self.grade=g # 覆盖 def speak(self): print("{} speak age {} grade {}".format(self.name,self.age,self.grade)) # 析构函数 def __del__(self): pass # 打印 def __repr__(self): return "myAge:{age}".format(age=self.age) st = student('ak',16,'g3') st.speak() super(student,st).speak() # 调用父类已被覆盖的方法 print(repr(st)) ``` > mirror: http://mirrors.aliyun.com/pypi/simple/ 作用域: Python 中只有模块(module),类(class)以及函数(def、lambda)才会引入新的作用域,其它的代码块(如 if/elif/else/、try/except、for/while等)是不会引入新的作用域的,也就是说这些语句内定义的变量,外部也可以访问,如下代码: ```python ``` global 和 nonlocal关键字用于在指定作用域内修改全局或闭包外部作用域内的变量 ### 标准库 [官方中文文档目录](https://docs.python.org/zh-cn/3.7/library/index.html) HTTP请求 ```python from urllib.request import urlopen for line in urlopen('http://baidu.com'): line = line.decode('utf-8') print(line) # pip install requests import requests requests=requests.get('http://baidu.com') print(requests.content) ``` 日志 ```python import logging logging.warning("warnmsg") log=logging.getLogger('sk') # logging.debug(msg, *args, **kwargs) log.warning('hello') # 输出 日志级别:日志器名称:日志内容 # 默认是指格式 BASIC_FORMAT "%(levelname)s:%(name)s:%(message)s" LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s" DATE_FORMAT = "%m/%d/%Y %H:%M:%S %p" logging.basicConfig(filename='my.log', level=logging.DEBUG, format=LOG_FORMAT, datefmt=DATE_FORMAT) log=logging.getLogger("filelog") log.info("this is a log record") # 打印错误信息和堆栈 logging.warning("Some one delete the log file.", exc_info=True, stack_info=True, extra={'user': 'Tom', 'ip':'47.98.53.222'}) ``` 多线程 ```python from threading import Timer import time # 定时 def hello(): print('hello') t=Timer(4.0,hello) t.start() while True: time.sleep(1) ``` 数学库 | Quantum Computing | Statistical Computing | Signal Processing | Image Processing | Graphs and Networks | Astronomy Processes | Cognitive Psychology | | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | ![A computer chip.](https://numpy.org/images/content_images/sc_dom_img/quantum_computing.svg) | ![A line graph with the line moving up.](https://numpy.org/images/content_images/sc_dom_img/statistical_computing.svg) | ![A bar chart with positive and negative values.](https://numpy.org/images/content_images/sc_dom_img/signal_processing.svg) | ![An photograph of the mountains.](https://numpy.org/images/content_images/sc_dom_img/image_processing.svg) | ![A simple graph.](https://numpy.org/images/content_images/sc_dom_img/sd6.svg) | ![A telescope.](https://numpy.org/images/content_images/sc_dom_img/astronomy_processes.svg) | ![A human head with gears.](https://numpy.org/images/content_images/sc_dom_img/cognitive_psychology.svg) | | [QuTiP](http://qutip.org/) | [Pandas](https://pandas.pydata.org/) | [SciPy](https://www.scipy.org/) | [Scikit-image](https://scikit-image.org/) | [NetworkX](https://networkx.org/) | [AstroPy](https://www.astropy.org/) | [PsychoPy](https://www.psychopy.org/) | | [PyQuil](https://pyquil-docs.rigetti.com/en/stable) | [statsmodels](https://github.com/statsmodels/statsmodels) | [PyWavelets](https://pywavelets.readthedocs.io/) | [OpenCV](https://opencv.org/) | [graph-tool](https://graph-tool.skewed.de/) | [SunPy](https://github.com/sunpy/sunpy) | | | [Qiskit](https://qiskit.org/) | [Xarray](https://xarray.pydata.org/en/stable/) | [python-control](https://python-control.org/) | [Mahotas](https://mahotas.rtfd.io/) | [igraph](https://igraph.org/python/) | [SpacePy](https://github.com/spacepy/spacepy) | | | | [Seaborn](https://github.com/mwaskom/seaborn) | | | [PyGSP](https://pygsp.rtfd.io/) | | | | Bioinformatics | Bayesian Inference | Mathematical Analysis | Chemistry | Geoscience | Geographic Processing | Architecture & Engineering | | ![A strand of DNA.](https://numpy.org/images/content_images/sc_dom_img/bioinformatics.svg) | ![A graph with a bell-shaped curve.](https://numpy.org/images/content_images/sc_dom_img/bayesian_inference.svg) | ![Four mathematical symbols.](https://numpy.org/images/content_images/sc_dom_img/mathematical_analysis.svg) | ![A test tube.](https://numpy.org/images/content_images/sc_dom_img/chemistry.svg) | ![The Earth.](https://numpy.org/images/content_images/sc_dom_img/geoscience.svg) | ![A map.](https://numpy.org/images/content_images/sc_dom_img/GIS.svg) | ![A microprocessor development board.](https://numpy.org/images/content_images/sc_dom_img/robotics.svg) | | [BioPython](https://biopython.org/) | [PyStan](https://pystan.readthedocs.io/en/latest/) | [SciPy](https://www.scipy.org/) | [Cantera](https://cantera.org/) | [Pangeo](https://pangeo.io/) | [Shapely](https://shapely.readthedocs.io/) | [COMPAS](https://compas.dev/) | | [Scikit-Bio](http://scikit-bio.org/) | [PyMC3](https://docs.pymc.io/) | [SymPy](https://www.sympy.org/) | [MDAnalysis](https://www.mdanalysis.org/) | [Simpeg](https://simpeg.xyz/) | [GeoPandas](https://geopandas.org/) | [City Energy Analyst](https://cityenergyanalyst.com/) | | [PyEnsembl](https://github.com/openvax/pyensembl) | [ArviZ](https://arviz-devs.github.io/arviz/) | [cvxpy](https://github.com/cvxgrp/cvxpy) | [RDKit](https://github.com/rdkit/rdkit) | [ObsPy](https://github.com/obspy/obspy/wiki) | [Folium](https://python-visualization.github.io/folium) | [Sverchok](https://nortikin.github.io/sverchok/) | | [ETE](http://etetoolkit.org/) | [emcee](https://emcee.readthedocs.io/) | [FEniCS](https://fenicsproject.org/) | | [Fatiando a Terra](https://www.fatiando.org/) | | | - #### [NumPy](http://numpy.org/) 提供基础的N维数组 - [![scipy](https://www.scipy.org/_static/images/scipy_med.png)](https://www.scipy.org/scipylib/index.html) #### [SciPy library](https://www.scipy.org/scipylib/index.html) 科学计算基础包 - [![matplotlib](https://www.scipy.org/_static/images/matplotlib_med.png)](http://matplotlib.org/) #### [Matplotlib](http://matplotlib.org/) 综合的2D图形包 - [![ipython](https://www.scipy.org/_static/images/ipython.png)](http://ipython.org/) #### [IPython](http://ipython.org/) 增强的交互控制台 - [![sympy](https://www.scipy.org/_static/images/sympy_logo.png)](http://sympy.org/) #### [SymPy](http://sympy.org/) 符号数学 - [![pandas badge](https://www.scipy.org/_static/images/pandas_badge2.jpg)](http://pandas.pydata.org/) #### [pandas](http://pandas.pydata.org/) Data structures & analysis 数据结构化和分析工具 Pandas有数据类型 Series / DataFrame pd.read_csv CSV《==》 DataFrame pd.to_csv pd.DataFrame(str) || pd.read_json(str) pd.read_json JSON <==> DataFrame 数据DataFrame的简单操作: ```python import pandas as pd import numpy as np data=pd.DataFrame(pd.read_csv('data.csv',encoding='gbk')) data=data.sort_values('采集时间',ascending=True) data.head(10) # 前10个 data.loc[data['幅值(mv)']>220] # 过滤幅值大于220的 data.loc[(data['幅值(mv)']>220) & (data['采集时间']>'2021-01-30 22:40:47'),['设备','采集时间','幅值(mv)']].head() data['幅值(mv)'].sum() # count求和 mean平均 # 时间格式 data['date']=pd.to_datetime(data['采集时间']) # 循环处理 for x in data.index: data.loc[x,'value'] ``` JSON处理示例: ```python import pandas as pd import json # 使用 Python JSON 模块载入数据 with open('nested_list.json','r') as f: data = json.loads(f.read()) # 展平数据 df_nested_list = pd.json_normalize(data, record_path =['students']) // students字段嵌套 print(df_nested_list) ```