Anxinyun Analyze

Data analysis tool for Anxinyun (安心云).

Reference implementation: 优矿 (Uqer):

Block diagram of the data analysis tool, by Tyr.Liu

Startup:

config.js

{
    entry: require('./middlewares/orchestrator').entry,
    opts: {
        kubernetes: {
            url: ANXINCLOUD_K8S_API || 'https://10.8.30.157:6443',
            insecureSkipTlsVerify: true,
            version: 'v1',
            promises: true,
            namespace: 'anxinyun',
            auth: {
                bearer: ANXINCLOUD_K8S_AUTH || '......'
            }
        },
        //runInPod: true,
        apiUrl: ANXINCLOUD_API,
        notebookToken: '6bf509929765366acb8ef066aa30d2cfc57af186a25f229a',
        instanceName: 'anxinyun-jupyter-notebook',
        proxyPort: 18305
    }
}

set NODE_ENV=development&&node server -p 8000 -u http://10.8.30.157:19084 --qnak YwL-KPPPrPFqm5VfCDLSSePi6pa0c0rxbTDGVUSQ --qnsk dFHk_EfTk6ufIaG56h4gzcL3IrAtwl2RkJcl8XuO --qnbkt notebook-test --qndmn http://pcd3v07yz.bkt.clouddn.com
# -u: address of the data API

Jupyter Notebook

The name Jupyter is a combination of Julia, Python, and R.

Installation and usage

pip3 install jupyter
# Show help
jupyter notebook -h

# Start
jupyter notebook

Open http://localhost:8888/

Changing the notebook directory

jupyter notebook --generate-config
 
# file:///C:/Users/yww08/.jupyter/jupyter_notebook_config.py
# c.NotebookApp.notebook_dir = 'E:/tmp/notebook'

Magic commands (based on IPython)

%%timeit
Measures the run time of the whole cell.
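
Outside a notebook, a similar measurement can be taken with the standard-library timeit module. The following is only a sketch: the workload function build_squares and the run count are made up for illustration.

```python
import timeit

def build_squares(n=1000):
    """Hypothetical workload: build a list of squares."""
    return [i * i for i in range(n)]

# Call the workload 200 times and report the average time per call.
runs = 200
total_seconds = timeit.timeit(build_squares, number=runs)
print("average per run: {:.6f} s".format(total_seconds / runs))
```

Unlike %%timeit, which chooses the number of loops automatically, timeit.timeit runs exactly the number of iterations it is given.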

K8S API authentication

kubectl get sa -n anxinyun

# clusterrole.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: anxinyun
  name: operator
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["services","pods"]
  verbs: ["get", "watch", "list","create","update","patch"]
  
  
kubectl create clusterrolebinding operator-pod \
  --clusterrole=operator  \
  --serviceaccount=anxinyun:default
  
  
  
fastest@test-master:~$ kubectl get sa -n anxinyun -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    creationTimestamp: "2020-08-17T10:35:35Z"
    name: default
    namespace: anxinyun
    resourceVersion: "5982"
    selfLink: /api/v1/namespaces/anxinyun/serviceaccounts/default
    uid: a1100eea-19c2-4477-afca-61344353f2e5
  secrets:
  - name: default-token-zp6cz
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
fastest@test-master:~$ kubectl describe secret default-token-zp6cz -n anxinyun
Name:         default-token-zp6cz
Namespace:    anxinyun
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: default
              kubernetes.io/service-account.uid: a1100eea-19c2-4477-afca-61344353f2e5

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  8 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6ImFiVlF0Y1NyZjNNTkRVMFVieTNNTzhyVlc5T094Y3J2RmFfYTF6R0pveDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJhbnhpbnl1biIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkZWZhdWx0LXRva2VuLXpwNmN6Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImRlZmF1bHQiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhMTEwMGVlYS0xOWMyLTQ0NzctYWZjYS02MTM0NDM1M2YyZTUiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6YW54aW55dW46ZGVmYXVsdCJ9.elI35PPYtQp-fletleFR7so88Vozk7g8B7oRa1zy2LxSL1m26s8X6SJAipR5uqweNyi8JML3Yo3lPhs6mmzNLxkTRVk1atyXcCSr6J_iPD2dUUaGTL-ZPRYZ1x8Eb2PfugEQM5tf5YXERXqPpEsxTLM83KkI8ogFJQhLG7s-lWZFbvcgmKpCo3lmzuYf-hO0-JngjLhRxptCUqaFx6s8QwQz0dxNn_EvtMXbZm2cTkewJdsFAzczuKtt2sLiJCl5CSRghWAqkP9pBiC2diwDKzz9A0DevG0b3n7J-9_4fPtbXa5zQI60Rg3XVZRof0XNjw5Nze0ee8bn-6XI8yxIug
fastest@test-master:~$
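
The token printed above is presumably the value that the auth.bearer field in config.js expects. As a rough sketch of how such a token is used, the code below builds (but does not send) a request that lists the pods of the anxinyun namespace. The helper name build_pod_list_request and the placeholder token are assumptions for illustration; the path follows the core API convention /api/v1/namespaces/<namespace>/pods.

```python
from urllib.request import Request

def build_pod_list_request(api_url, namespace, token):
    """Build a GET request for the pods of one namespace (hypothetical helper)."""
    url = "{}/api/v1/namespaces/{}/pods".format(api_url.rstrip("/"), namespace)
    # The service-account token is sent as a standard Bearer token.
    return Request(url, headers={"Authorization": "Bearer " + token})

# "<token>" is a placeholder, not a real credential.
req = build_pod_list_request("https://10.8.30.157:6443", "anxinyun", "<token>")
print(req.full_url)
print(req.get_header("Authorization"))
```

Actually sending the request needs network access to the cluster and a TLS context that either trusts the API server certificate or deliberately skips verification.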

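Local MicroK8s

Installing WSL

The Windows Subsystem for Linux lets developers run a GNU/Linux environment (including most command-line tools, utilities, and applications) directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual-boot setup.

Install WSL

PowerShell
> wsl --install

PS C:\Users\yww08> wsl --list --online
The following is a list of valid distributions that can be installed.
Install using 'wsl --install -d <Distro>'.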

NAME            FRIENDLY NAME
Ubuntu          Ubuntu
Debian          Debian GNU/Linux
kali-linux      Kali Linux Rolling
openSUSE-42     openSUSE Leap 42
SLES-12         SUSE Linux Enterprise Server v12
Ubuntu-16.04    Ubuntu 16.04 LTS
Ubuntu-18.04    Ubuntu 18.04 LTS
Ubuntu-20.04    Ubuntu 20.04 LTS
PS C:\Users\yww08> wsl --install -d Ubuntu-18.04
Downloading: Ubuntu 18.04 LTS
[======================    38.4%                           ]

Create a Linux user
Installing, this may take a few minutes...
Please create a default UNIX user account. The username does not need to match your Windows username.
For more information visit: https://aka.ms/wslusers
Enter new UNIX username: yww
Enter new UNIX password: 123
Retype new UNIX password: 123
passwd: password updated successfully
Installation successful!
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

Update and upgrade packages regularly with the distribution's package manager:
sudo apt update && sudo apt upgrade

Files on the Windows host are accessible under /mnt/c/.
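
To make the /mnt/c/ mapping concrete, here is a small sketch that converts a Windows path to the corresponding WSL path, assuming the default mount layout (drive X: mounted at /mnt/x). The helper name windows_to_wsl and the example file name are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

def windows_to_wsl(path):
    """Map a Windows path to its /mnt/<drive>/... form (illustrative helper)."""
    win = PureWindowsPath(path)
    drive_letter = win.drive.rstrip(":").lower()
    # parts[0] is the drive root ("C:\\"); the rest are the path components.
    return str(PurePosixPath("/mnt", drive_letter, *win.parts[1:]))

print(windows_to_wsl(r"C:\Users\yww08\notes.txt"))
```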

Windows Terminal shortcuts:

ctrl+shift+d: new tab

alt+shift+d: split pane

ctrl+shift+w: close

# Install Docker
apt install docker.io

sudo usermod -aG docker $USER

sudo cgroupfs-mount
sudo service docker start 

#systemctl daemon-reload
#systemctl restart docker.service

# The installation method above has problems
curl https://get.docker.com | sh

MicroK8s (GitHub)

Install MicroK8s with:

(This attempt failed!)

snap install microk8s --classic

MicroK8s includes a microk8s kubectl command:

sudo microk8s kubectl get nodes
sudo microk8s kubectl get services

To use MicroK8s with your existing kubectl:

sudo microk8s kubectl config view --raw > $HOME/.kube/config

Add the user to the microk8s group to grant access to k8s:

sudo usermod -a -G microk8s <username>

Kubernetes add-ons

MicroK8s installs a barebones upstream Kubernetes. Additional services like dns and the Kubernetes dashboard can be enabled using the microk8s enable command.

sudo microk8s enable dns dashboard

Use microk8s status to see a list of enabled and available addons. You can find the addon manifests and/or scripts under ${SNAP}/actions/, with ${SNAP} pointing by default to /snap/microk8s/current.

Copy from Kai.Lu

Image preparation: fetch-images.sh

#!/bin/bash
images=(
k8s.gcr.io/pause:3.1=mirrorgooglecontainers/pause-amd64:3.1
gcr.io/google_containers/defaultbackend-amd64:1.4=mirrorgooglecontainers/defaultbackend-amd64:1.4
k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1=registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
k8s.gcr.io/heapster-influxdb-amd64:v1.3.3=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.3.3
k8s.gcr.io/heapster-amd64:v1.5.2=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.2
k8s.gcr.io/heapster-grafana-amd64:v4.4.3=registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v4.4.3
k8s.gcr.io/metrics-server-amd64:v0.3.6=registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6
)

OIFS=$IFS # save the old value

for image in "${images[@]}"; do
    IFS='='
    set -- $image   # split "target=mirror" into $1 (target) and $2 (mirror)
    docker pull "$2"
    docker tag  "$2" "$1"
    docker rmi  "$2"
    docker save "$1" > 1.tar && microk8s.ctr --namespace k8s.io image import 1.tar && rm 1.tar
    IFS=$OIFS # restore the old value
done

./fetch-images.sh

microk8s status --wait-ready

alias mk='microk8s.kubectl'

mk get pods -A

Python

#!/usr/bin/python
# -*- coding: UTF-8 -*-
 
print( "你好，世界" )

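# Python 3.x source files use UTF-8 by default, so Chinese text parses correctly without an explicit UTF-8 declaration.

Comments:
# single line
'''
multi-line
'''

"""
multi-line
"""

Types:
Number
String
List: list1 = ['Google', 'Runoob', 1997, 2000]
Tuple (elements cannot be modified): tup1 = ('physics', 'chemistry', 1997, 2000)
Dictionary: dict1 = { 'abc': 456 }
Set: a = set('abracadabra'), or use braces ({...}) to create a set. Note: to create an empty set you must use set() rather than {}; the latter creates an empty dictionary.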
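
The note about empty sets is easy to check interactively. A short sketch (variable names are arbitrary):

```python
empty_dict = {}                # braces alone create a dict
empty_set = set()              # an empty set needs set()
letters = set('abracadabra')   # duplicates collapse to the unique letters

print(type(empty_dict).__name__, type(empty_set).__name__)
print(sorted(letters))
```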


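Delete an element: del a['k']
range

Time:
import time  # import the time module

ticks = time.time()
print("Current timestamp:", ticks)

Tuple operations:
Expression	Result	Description
len((1, 2, 3))	3	count the elements
(1, 2, 3) + (4, 5, 6)	(1, 2, 3, 4, 5, 6)	concatenation
('Hi!',) * 4	('Hi!', 'Hi!', 'Hi!', 'Hi!')	repetition
3 in (1, 2, 3)	True	membership test
for x in (1, 2, 3): print(x, end=' ')	1 2 3	iteration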

Notes

a,b=0,1
while b<10:
    print(b,end=',')
    a,b=b,a+b
    
print()
# while/else loop
c=0
while c<10:
    print(c)
    c+=2
else:
    print("after c=",c)
    
# sets are unordered
chars=set('abcdefg')
for c in chars:
    print(c)
    
# range([start,]stop[,step])
for i in range(5,21,2):
    print(i)
    
# iterate over a list by index
a=['a','b','c']
for i in range(len(a)):
    print(i,':',a[i])
    
# pass can be used as a placeholder for an empty statement

## Iterators
a=[1,2,3,4]
it=iter(a)
print(next(it))
for i in it:
    print('in range:',i)
    
# the while/next form
import sys
items=[1,2,3,4]
it = iter(items)    # create an iterator object
 
while True:
    try:
        print (next(it))
    except StopIteration:
        print ('finished')
        break
        #sys.exit()
        
print('is here?')

# Generate the Fibonacci sequence with yield
def fibonacci(n):
    a,b,count=0,1,0
    while True:
        if (count>n):
            return
        yield a
        a,b=b,a+b
        count+=1
        
f=fibonacci(10)
for fi in f:
    print('f',fi,end=',')
    

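# Functions
def hello() :
    print("Hello World!")

ret=hello()
print('ret:'+str(ret))

'''
Immutable types (strings, tuples, numbers) passed as arguments behave like pass-by-value in C++.
Mutable types (list, dict) passed as arguments behave like pass-by-reference in C++.

Arguments can be passed by name and can have default values.
Variable-length arguments look like this:
'''
def printinfo(arg1,*vartuple):
    print(arg1)
    print(len(vartuple))
printinfo(7)

# A parameter prefixed with two asterisks (**) receives keyword arguments as a dictionary.

add=lambda a,b:a+b
print(add(1,20))

# List comprehension
vec=[2,4,6]
dd=[3*x for x in vec]
print(dd)

# Dictionary traversal
dics={'name':'ww','age':18}
for k,v in dics.items():
    print(k,'=',v)
    
# Traversal tips
# for i,v in enumerate(list): get the index and the value together
# zip(list1,list2): combine two sequences
# reversed(seq): iterate in reverse
# sorted(seq): iterate in sorted order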

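## Type conversion
# int(x[, base]): convert x to an integer; base is the radix, default 10
#
# long(x[, base]): convert x to a long integer (Python 2 only; in Python 3 use int)
#
# float(x): convert x to a floating-point number
#
# complex(real[, imag]): create a complex number
#
# str(x): convert object x to a string
#
# repr(x): convert object x to an expression string
#
# eval(str): evaluate a valid Python expression in a string and return the resulting object
#
# tuple(s): convert sequence s to a tuple
#
# list(s): convert sequence s to a list
#
# set(s): convert to a mutable set
#
# dict(d): create a dictionary; d must be a sequence of (key, value) tuples
#
# frozenset(s): convert to an immutable set
#
# chr(x): convert an integer to a character
#
# unichr(x): convert an integer to a Unicode character (Python 2 only; in Python 3 use chr)
#
# ord(x): convert a character to its integer value
#
# hex(x): convert an integer to a hexadecimal string
#
# oct(x): convert an integer to an octal string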
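
A few of the conversions listed above in action, as a quick Python 3 sketch (the variable names are arbitrary):

```python
as_int = int('ff', 16)                  # parse a hexadecimal string
as_hex, as_oct = hex(255), oct(8)       # integer to hex / octal strings
char, code = chr(65), ord('A')          # code point <-> character
pairs = dict([('a', 1), ('b', 2)])      # sequence of (key, value) tuples
unique = sorted(set([3, 1, 3, 2]))      # set() drops duplicates

print(as_int, as_hex, as_oct, char, code)
print(pairs, unique)
```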

Modules:

m.py

# !/usr/bin/python3
if __name__=='__main__':
    # the file is run as a standalone program
    pass
else:
    # the file is imported as a module
    print('moduled')
    
def fabonacci(n):
    a,b,c=0,1,0
    while c<n:
        yield b
        a,b=b,a+b
        c+=1

Using the module:

import m

ret=m.fabonacci(10)

for r in ret:
    print(r)

Packages:

A package is a folder that contains an __init__.py file.

sound/                          Top-level package
      __init__.py               Initializes the sound package
      formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...

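With from package import item, item can be a submodule (or subpackage) of the package, or any name defined in the package (a function or a variable).

With from sound.effects import *, if the package has submodules, it must define an __all__ variable listing the ones to import: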

__all__ = ['echo','surround','reverse']
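
The effect of __all__ can be tried without touching a real project. The sketch below writes a throwaway package to a temporary directory and runs a star import against it; the package name sound_demo and the module contents are made up, loosely mirroring the sound/effects layout above.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk (all names here are illustrative).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "sound_demo")
os.mkdir(pkg_dir)
files = {
    "__init__.py": "__all__ = ['echo', 'surround']\n",
    "echo.py": "NAME = 'echo'\n",
    "surround.py": "NAME = 'surround'\n",
    "reverse.py": "NAME = 'reverse'\n",
}
for filename, source in files.items():
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write(source)

sys.path.insert(0, root)
namespace = {}
exec("from sound_demo import *", namespace)

# Only the submodules named in __all__ are pulled in by the star import.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Here reverse.py exists in the package but is left out of __all__, so the star import does not load it.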

Input/output and file operations

# rjust: right-align
for x in range(1,11):
    print(repr(x).rjust(2),repr(x*x).rjust(3),repr(x*x*x).rjust(4),end=' ') 
    print()
    
print('name is {0},age {1},alias {alias}'.format('ww',18,alias='peter'))

# Read a file
with open('foo.txt','r') as f:
    # f.readlines() or f.read(length) also work
    for line in f:
        print (line,end='')

print(f.closed)

# Write a file
with open('bar.txt','w') as f:
    for a in range(0,10):
        f.write(str(a))

f=open('bar.txt','r')
print(f.readlines())
f.close()

# Serialize and deserialize with pickle
import pickle,pprint

data1={
    'a':[1,2.0,4+3j],
    'b':('text',u'unicode text'),
    'c':None
}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output=open('data.pk1','wb')
pickle.dump(data1,output)
pickle.dump(selfref_list,output,-1)
output.close()

# Deserialize
infile=open('data.pk1','rb')
data1=pickle.load(infile)
pprint.pprint(data1)
pprint.pprint(pickle.load(infile))
infile.close()

Errors and exceptions

try:
    ...
except OSError as err:
    print('OS Error: {0}'.format(err))
except (RuntimeError,TypeError,NameError):
    pass
except:
    raise
else: # runs when no exception occurred
    ...
finally: # always runs
    ...
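
To see which clauses actually run, the template above can be filled in with a concrete case. This is a sketch only; the helper safe_divide and its step log are invented for illustration.

```python
def safe_divide(a, b):
    """Hypothetical helper: return (result, list of clauses that ran)."""
    steps = []
    result = None
    try:
        result = a / b
    except ZeroDivisionError as err:
        steps.append("except: {}".format(err))
    else:
        steps.append("else")       # runs only when no exception was raised
    finally:
        steps.append("finally")    # runs in every case
    return result, steps

print(safe_divide(6, 3))
print(safe_divide(1, 0))
```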

Object-oriented programming

#!/usr/bin/python3

class people:
    name=''
    
    # constructor
    def __init__(self,n,a):
        self.name=n
        self.age=a
    
    # method
    def speak(self):
        print("%s speak age %d"%(self.name,self.age))

# Inheritance (multiple inheritance is also supported; parent methods are searched left to right)
class student(people):
    grade=''
    
    # private variable
    __private_attrs=0
    
    def __init__(self,n,a,g):
        people.__init__(self,n,a)
        self.grade=g
        
    # override
    def speak(self):
        print("{} speak age {} grade {}".format(self.name,self.age,self.grade))
        
    # destructor
    def __del__(self):
        pass
        
    # printable representation
    def __repr__(self):
        return "myAge:{age}".format(age=self.age)
        
st = student('ak',16,'g3')
st.speak()
super(student,st).speak() # call the parent-class method that was overridden
print(repr(st))
print(repr(st))

mirror: http://mirrors.aliyun.com/pypi/simple/

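Scope:

In Python, only modules, classes, and functions (def, lambda) introduce a new scope. Other blocks (such as if/elif/else, try/except, for/while) do not, so variables defined inside them are also accessible outside.

The global and nonlocal keywords make it possible to modify, from an inner scope, a variable in the global scope or in an enclosing function's scope.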
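
A minimal sketch of both points, with made-up names (msg, counter, make_counter):

```python
if True:
    msg = "set inside if"   # an if-block does not create a new scope
print(msg)

counter = 0

def bump_global():
    global counter          # rebind the module-level name
    counter += 1

def make_counter():
    count = 0
    def step():
        nonlocal count      # rebind the enclosing function's variable
        count += 1
        return count
    return step

bump_global()
step = make_counter()
step()
print(counter, step())
```

Without the global or nonlocal declaration, the augmented assignments would raise UnboundLocalError, because assignment makes the name local to the function.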

Standard library: official Chinese documentation index

HTTP requests

from urllib.request import urlopen
for line in urlopen('http://baidu.com'):
    line = line.decode('utf-8')
    print(line)
    
# pip install requests
import requests

response=requests.get('http://baidu.com')
print(response.content)

Logging

import logging

logging.warning("warnmsg")

log=logging.getLogger('sk')

# logging.debug(msg, *args, **kwargs)	
log.warning('hello')

# Output: level:logger name:message
# The default format is BASIC_FORMAT, "%(levelname)s:%(name)s:%(message)s"

LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%m/%d/%Y %H:%M:%S %p"
# basicConfig does nothing if the root logger already has handlers (for example after the
# logging.warning call above); run it first, or pass force=True on Python 3.8+.
logging.basicConfig(filename='my.log', level=logging.DEBUG, format=LOG_FORMAT, datefmt=DATE_FORMAT)
log=logging.getLogger("filelog")
log.info("this is a log record")

# Log the error information and the stack trace
logging.warning("Some one delete the log file.", exc_info=True, stack_info=True, extra={'user': 'Tom', 'ip':'47.98.53.222'})

Multithreading

from threading import Timer
import time

# Run a function after a delay
def hello():
    print('hello')
    
t=Timer(4.0,hello)
t.start()

while True:
    time.sleep(1)
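
The while True loop above only keeps the main thread alive. As an alternative sketch, a Timer can be waited on with join() or stopped with cancel(); the callback record and the delays below are made up for illustration.

```python
from threading import Timer

fired = []

def record(tag):
    fired.append(tag)

# A Timer is a thread: start() schedules the call, join() waits for it.
t1 = Timer(0.1, record, args=("first",))
t2 = Timer(5.0, record, args=("never",))
t1.start()
t2.start()
t2.cancel()   # cancelled before its delay elapses, so it never fires
t1.join()
t2.join()
print(fired)
```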

Math libraries

Quantum Computing	Statistical Computing	Signal Processing	Image Processing	Graphs and Networks	Astronomy Processes	Cognitive Psychology

QuTiP	Pandas	SciPy	Scikit-image	NetworkX	AstroPy	PsychoPy
PyQuil	statsmodels	PyWavelets	OpenCV	graph-tool	SunPy
Qiskit	Xarray	python-control	Mahotas	igraph	SpacePy
	Seaborn			PyGSP
Bioinformatics	Bayesian Inference	Mathematical Analysis	Chemistry	Geoscience	Geographic Processing	Architecture & Engineering

BioPython	PyStan	SciPy	Cantera	Pangeo	Shapely	COMPAS
Scikit-Bio	PyMC3	SymPy	MDAnalysis	Simpeg	GeoPandas	City Energy Analyst
PyEnsembl	ArviZ	cvxpy	RDKit	ObsPy	Folium	Sverchok
ETE	emcee	FEniCS		Fatiando a Terra

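NumPy

Provides the base N-dimensional array.

SciPy library

Fundamental package for scientific computing.

Matplotlib

Comprehensive 2D plotting.

IPython

Enhanced interactive console.

SymPy

Symbolic mathematics.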

pandas

Data structures & analysis: a tool for structuring and analyzing data.

pandas has two main data types: Series and DataFrame.

CSV <==> DataFrame: pd.read_csv / DataFrame.to_csv

JSON <==> DataFrame: pd.read_json(str), or pd.DataFrame(...) on already-parsed data / DataFrame.to_json
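
A round trip through both formats, as a small sketch. It assumes pandas is installed; the column names and values are invented, and io.StringIO stands in for files on disk.

```python
import io
import pandas as pd

df = pd.DataFrame({"device": ["a", "b"], "value": [1.5, 2.5]})

# DataFrame -> CSV text -> DataFrame
csv_text = df.to_csv(index=False)
from_csv = pd.read_csv(io.StringIO(csv_text))

# DataFrame -> JSON text -> DataFrame
json_text = df.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```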

Basic DataFrame operations:

import pandas as pd
import numpy as np
data=pd.read_csv('data.csv',encoding='gbk')
data=data.sort_values('采集时间',ascending=True)
data.head(10) # first 10 rows
data.loc[data['幅值(mv)']>220] # rows whose amplitude (幅值) exceeds 220
data.loc[(data['幅值(mv)']>220) & (data['采集时间']>'2021-01-30 22:40:47'),['设备','采集时间','幅值(mv)']].head()
data['幅值(mv)'].sum() # sum; count() and mean() work the same way

# Parse the acquisition time (采集时间) as datetime
data['date']=pd.to_datetime(data['采集时间'])

# Process row by row
for x in data.index:
    data.loc[x,'value']

JSON processing example:

import pandas as pd
import json

# Load the data with Python's json module
with open('nested_list.json','r') as f:
    data = json.loads(f.read())

# Flatten the data
df_nested_list = pd.json_normalize(data, record_path =['students']) # the students field is nested
print(df_nested_list)
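
The file nested_list.json is not included here, so the sketch below uses inline data of an assumed shape: only the students field comes from the example above, and every other field name and value is invented. It also shows the meta parameter, which copies parent fields onto each flattened row.

```python
import pandas as pd

# Hypothetical data; only the "students" key is taken from the example above.
data = {
    "school_name": "ABC primary school",
    "class": "Year 1",
    "students": [
        {"id": "A001", "name": "Tom", "math": 60},
        {"id": "A002", "name": "James", "math": 89},
    ],
}

# One row per entry in "students"; meta repeats the parent fields on each row.
df = pd.json_normalize(data, record_path=["students"], meta=["school_name", "class"])
print(df)
```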
